Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax)

High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference...

Bibliographic Details
Main Authors: Dueholm, Morten Simonsen, Andersen, Kasper Skytte, McIlroy, Simon Jon, Kristensen, Jannie Munk, Yashiro, Erika, Karst, Søren Michael, Albertsen, Mads, Nielsen, Per Halkjær
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512547/
https://www.ncbi.nlm.nih.gov/pubmed/32963001
http://dx.doi.org/10.1128/mBio.01557-20
author Dueholm, Morten Simonsen
Andersen, Kasper Skytte
McIlroy, Simon Jon
Kristensen, Jannie Munk
Yashiro, Erika
Karst, Søren Michael
Albertsen, Mads
Nielsen, Per Halkjær
collection PubMed
description High-throughput 16S rRNA gene amplicon sequencing is an essential method for studying the diversity and dynamics of microbial communities. However, this method is presently hampered by the lack of high-identity reference sequences for many environmental microbes in the public 16S rRNA gene reference databases and by the absence of a systematic and comprehensive taxonomy for the uncultured majority. Here, we demonstrate how high-throughput synthetic long-read sequencing can be applied to create ecosystem-specific full-length 16S rRNA gene amplicon sequence variant (FL-ASV) resolved reference databases that include high-identity references (>98.7% identity) for nearly all abundant bacteria (>0.01% relative abundance), using Danish wastewater treatment systems and anaerobic digesters as an example. In addition, we introduce a novel sequence identity-based approach for automated taxonomy assignment (AutoTax) that provides a complete seven-rank taxonomy for all reference sequences, using the SILVA taxonomy as a backbone, with stable placeholder names for unclassified taxa. The FL-ASVs are perfectly suited for the evaluation of taxonomic resolution and bias associated with primers commonly used for amplicon sequencing, allowing researchers to choose those that are ideal for their ecosystem. Reference databases processed with AutoTax greatly improve the classification of short-read 16S rRNA ASVs at the genus and species levels, compared with the commonly used universal reference databases. Importantly, the placeholder names provide a way to explore the unclassified environmental taxa at different taxonomic ranks, which in combination with in situ analyses can be used to uncover their ecological roles.
format Online
Article
Text
id pubmed-7512547
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-7512547 2020-09-25 mBio Research Article American Society for Microbiology 2020-09-22 /pmc/articles/PMC7512547/ /pubmed/32963001 http://dx.doi.org/10.1128/mBio.01557-20 Text en Copyright © 2020 Dueholm et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Generation of Comprehensive Ecosystem-Specific Reference Databases with Species-Level Resolution by High-Throughput Full-Length 16S rRNA Gene Sequencing and Automated Taxonomy Assignment (AutoTax)
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512547/
https://www.ncbi.nlm.nih.gov/pubmed/32963001
http://dx.doi.org/10.1128/mBio.01557-20