Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax)
High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference...
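Main Authors: | Dueholm, Morten Simonsen; Andersen, Kasper Skytte; McIlroy, Simon Jon; Kristensen, Jannie Munk; Yashiro, Erika; Karst, Søren Michael; Albertsen, Mads; Nielsen, Per Halkjær |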
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512547/ https://www.ncbi.nlm.nih.gov/pubmed/32963001 http://dx.doi.org/10.1128/mBio.01557-20 |
_version_ | 1783586183455965184 |
---|---|
author | Dueholm, Morten Simonsen Andersen, Kasper Skytte McIlroy, Simon Jon Kristensen, Jannie Munk Yashiro, Erika Karst, Søren Michael Albertsen, Mads Nielsen, Per Halkjær |
author_facet | Dueholm, Morten Simonsen Andersen, Kasper Skytte McIlroy, Simon Jon Kristensen, Jannie Munk Yashiro, Erika Karst, Søren Michael Albertsen, Mads Nielsen, Per Halkjær |
author_sort | Dueholm, Morten Simonsen |
collection | PubMed |
description | High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference databases and by the absence of a systematic and comprehensive taxonomy for the uncultured majority. Here, we demonstrate how high-throughput synthetic long-read sequencing can be applied to create ecosystem-specific full-length 16S rRNA gene amplicon sequence variant (FL-ASV) resolved reference databases that include high-identity references (>98.7% identity) for nearly all abundant bacteria (>0.01% relative abundance) using Danish wastewater treatment systems and anaerobic digesters as an example. In addition, we introduce a novel sequence identity-based approach for automated taxonomy assignment (AutoTax) that provides a complete seven-rank taxonomy for all reference sequences, using the SILVA taxonomy as a backbone, with stable placeholder names for unclassified taxa. The FL-ASVs are perfectly suited for the evaluation of taxonomic resolution and bias associated with primers commonly used for amplicon sequencing, allowing researchers to choose those that are ideal for their ecosystem. Reference databases processed with AutoTax greatly improve the classification of short-read 16S rRNA ASVs at the genus and species levels, compared with the commonly used universal reference databases. Importantly, the placeholder names provide a way to explore the unclassified environmental taxa at different taxonomic ranks, which in combination with in situ analyses can be used to uncover their ecological roles. |
format | Online Article Text |
id | pubmed-7512547 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-7512547 2020-09-25 Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) Dueholm, Morten Simonsen Andersen, Kasper Skytte McIlroy, Simon Jon Kristensen, Jannie Munk Yashiro, Erika Karst, Søren Michael Albertsen, Mads Nielsen, Per Halkjær mBio Research Article High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference databases and by the absence of a systematic and comprehensive taxonomy for the uncultured majority. Here, we demonstrate how high-throughput synthetic long-read sequencing can be applied to create ecosystem-specific full-length 16S rRNA gene amplicon sequence variant (FL-ASV) resolved reference databases that include high-identity references (>98.7% identity) for nearly all abundant bacteria (>0.01% relative abundance) using Danish wastewater treatment systems and anaerobic digesters as an example. In addition, we introduce a novel sequence identity-based approach for automated taxonomy assignment (AutoTax) that provides a complete seven-rank taxonomy for all reference sequences, using the SILVA taxonomy as a backbone, with stable placeholder names for unclassified taxa. The FL-ASVs are perfectly suited for the evaluation of taxonomic resolution and bias associated with primers commonly used for amplicon sequencing, allowing researchers to choose those that are ideal for their ecosystem. Reference databases processed with AutoTax greatly improve the classification of short-read 16S rRNA ASVs at the genus and species levels, compared with the commonly used universal reference databases. Importantly, the placeholder names provide a way to explore the unclassified environmental taxa at different taxonomic ranks, which in combination with in situ analyses can be used to uncover their ecological roles. American Society for Microbiology 2020-09-22 /pmc/articles/PMC7512547/ /pubmed/32963001 http://dx.doi.org/10.1128/mBio.01557-20 Text en Copyright © 2020 Dueholm et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Dueholm, Morten Simonsen Andersen, Kasper Skytte McIlroy, Simon Jon Kristensen, Jannie Munk Yashiro, Erika Karst, Søren Michael Albertsen, Mads Nielsen, Per Halkjær Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) |
title | Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) |
title_full | Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) |
title_fullStr | Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) |
title_full_unstemmed | Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) |
title_short | Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax) |
title_sort | generation of comprehensive ecosystem-specific reference databases with species-level resolution by high-throughput full-length 16s rrna gene sequencing and automated taxonomy assignment (autotax) |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512547/ https://www.ncbi.nlm.nih.gov/pubmed/32963001 http://dx.doi.org/10.1128/mBio.01557-20 |