Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data
Short read 16S rRNA amplicon sequencing is a common technique used in microbiome research. However, inaccuracies in estimated bacterial community composition can occur due to amplification bias of the targeted hypervariable region. A potential solution is to sequence and assess multiple hypervariab...
Main Authors: | Jones, Carli B.; White, James R.; Ernst, Sarah E.; Sfanos, Karen S.; Peiffer, Lauren B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9009396/ https://www.ncbi.nlm.nih.gov/pubmed/35432480 http://dx.doi.org/10.3389/fgene.2022.799615 |
_version_ | 1784687259174305792 |
---|---|
author | Jones, Carli B.; White, James R.; Ernst, Sarah E.; Sfanos, Karen S.; Peiffer, Lauren B. |
author_facet | Jones, Carli B.; White, James R.; Ernst, Sarah E.; Sfanos, Karen S.; Peiffer, Lauren B. |
author_sort | Jones, Carli B. |
collection | PubMed |
description | Short read 16S rRNA amplicon sequencing is a common technique used in microbiome research. However, inaccuracies in estimated bacterial community composition can occur due to amplification bias of the targeted hypervariable region. A potential solution is to sequence and assess multiple hypervariable regions in tandem, yet there is currently no consensus as to the appropriate method for analyzing this data. Additionally, there are many sequence analysis resources for data produced from the Illumina platform, but fewer open-source options available for data from the Ion Torrent platform. Herein, we present an analysis pipeline using open-source analysis platforms that integrates data from multiple hypervariable regions and is compatible with data produced from the Ion Torrent platform. We used the ThermoFisher Ion 16S Metagenomics Kit and a mock community of twenty bacterial strains to assess taxonomic classification of six amplicons from separate hypervariable regions (V2, V3, V4, V6-7, V8, V9) using our analysis pipeline. We report that different amplicons have different specificities for taxonomic classification, which also has implications for global level analyses such as alpha and beta diversity. Finally, we utilize a generalized linear modeling approach to statistically integrate the results from multiple hypervariable regions and apply this methodology to data from a representative clinical cohort. We conclude that examining sequencing results across multiple hypervariable regions provides more taxonomic information than sequencing across a single region. The data across multiple hypervariable regions can be combined using generalized linear models to enhance the statistical evaluation of overall differences in community structure and relatedness among sample groups. |
format | Online Article Text |
id | pubmed-9009396 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9009396 2022-04-15 Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data Jones, Carli B. White, James R. Ernst, Sarah E. Sfanos, Karen S. Peiffer, Lauren B. Front Genet Genetics Short read 16S rRNA amplicon sequencing is a common technique used in microbiome research. However, inaccuracies in estimated bacterial community composition can occur due to amplification bias of the targeted hypervariable region. A potential solution is to sequence and assess multiple hypervariable regions in tandem, yet there is currently no consensus as to the appropriate method for analyzing this data. Additionally, there are many sequence analysis resources for data produced from the Illumina platform, but fewer open-source options available for data from the Ion Torrent platform. Herein, we present an analysis pipeline using open-source analysis platforms that integrates data from multiple hypervariable regions and is compatible with data produced from the Ion Torrent platform. We used the ThermoFisher Ion 16S Metagenomics Kit and a mock community of twenty bacterial strains to assess taxonomic classification of six amplicons from separate hypervariable regions (V2, V3, V4, V6-7, V8, V9) using our analysis pipeline. We report that different amplicons have different specificities for taxonomic classification, which also has implications for global level analyses such as alpha and beta diversity. Finally, we utilize a generalized linear modeling approach to statistically integrate the results from multiple hypervariable regions and apply this methodology to data from a representative clinical cohort. We conclude that examining sequencing results across multiple hypervariable regions provides more taxonomic information than sequencing across a single region. The data across multiple hypervariable regions can be combined using generalized linear models to enhance the statistical evaluation of overall differences in community structure and relatedness among sample groups. Frontiers Media S.A. 2022-03-31 /pmc/articles/PMC9009396/ /pubmed/35432480 http://dx.doi.org/10.3389/fgene.2022.799615 Text en Copyright © 2022 Jones, White, Ernst, Sfanos and Peiffer. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Jones, Carli B. White, James R. Ernst, Sarah E. Sfanos, Karen S. Peiffer, Lauren B. Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data |
title | Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data |
title_full | Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data |
title_fullStr | Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data |
title_full_unstemmed | Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data |
title_short | Incorporation of Data From Multiple Hypervariable Regions when Analyzing Bacterial 16S rRNA Gene Sequencing Data |
title_sort | incorporation of data from multiple hypervariable regions when analyzing bacterial 16s rrna gene sequencing data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9009396/ https://www.ncbi.nlm.nih.gov/pubmed/35432480 http://dx.doi.org/10.3389/fgene.2022.799615 |
work_keys_str_mv | AT jonescarlib incorporationofdatafrommultiplehypervariableregionswhenanalyzingbacterial16srrnagenesequencingdata AT whitejamesr incorporationofdatafrommultiplehypervariableregionswhenanalyzingbacterial16srrnagenesequencingdata AT ernstsarahe incorporationofdatafrommultiplehypervariableregionswhenanalyzingbacterial16srrnagenesequencingdata AT sfanoskarens incorporationofdatafrommultiplehypervariableregionswhenanalyzingbacterial16srrnagenesequencingdata AT peifferlaurenb incorporationofdatafrommultiplehypervariableregionswhenanalyzingbacterial16srrnagenesequencingdata |
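
The description above reports that different amplicons classify taxa with different specificity, which in turn affects global measures such as alpha diversity. As a rough illustration only, the sketch below computes the Shannon index, one common alpha diversity measure, for the same hypothetical sample as seen through three hypervariable regions. The read counts are invented toy values, the region labels are borrowed from the description, and the function name `shannon_index` is an assumption made for this example. It is not the authors' pipeline and it does not reproduce their generalized linear modeling step.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts if c > 0]
    return sum(-p * math.log(p) for p in proportions)


# Hypothetical per-taxon read counts for ONE sample, as classified from
# three different amplicons. These numbers are invented for illustration.
toy_counts = {
    "V2": [25, 25, 25, 25],  # all four taxa resolved evenly
    "V4": [70, 20, 10, 0],   # one taxon missed, uneven abundances
    "V9": [100, 0, 0, 0],    # every read collapsed into a single taxon
}

per_region = {region: shannon_index(c) for region, c in toy_counts.items()}

for region, h in per_region.items():
    print(f"{region}: H' = {h:.3f}")
```

In this toy case the same sample yields a different diversity estimate depending on which region is used, which is the kind of discrepancy that motivates examining several regions together.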