Ultrafast and accurate 16S rRNA microbial community analysis using Kraken 2
BACKGROUND: For decades, 16S ribosomal RNA sequencing has been the primary means for identifying the bacterial species present in a sample with unknown composition. One of the most widely used tools for this purpose today is the QIIME (Quantitative Insights Into Microbial Ecology) package. Recent re...
Main authors: | Lu, Jennifer; Salzberg, Steven L. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7455996/ https://www.ncbi.nlm.nih.gov/pubmed/32859275 http://dx.doi.org/10.1186/s40168-020-00900-2 |
_version_ | 1783575732955381760 |
---|---|
author | Lu, Jennifer Salzberg, Steven L. |
author_sort | Lu, Jennifer |
collection | PubMed |
description | BACKGROUND: For decades, 16S ribosomal RNA sequencing has been the primary means for identifying the bacterial species present in a sample with unknown composition. One of the most widely used tools for this purpose today is the QIIME (Quantitative Insights Into Microbial Ecology) package. Recent results have shown that the newest release, QIIME 2, has higher accuracy than QIIME, MAPseq, and mothur when classifying bacterial genera from simulated human gut, ocean, and soil metagenomes, although QIIME 2 also proved to be the most computationally expensive. Kraken, first released in 2014, has been shown to provide exceptionally fast and accurate classification for shotgun metagenomics sequencing projects. Bracken, released in 2016, then provided users with the ability to accurately estimate species or genus relative abundances using Kraken classification results. Kraken 2, which matches the accuracy and speed of Kraken 1, now supports 16S rRNA databases, allowing for direct comparisons to QIIME and similar systems. METHODS: For a comprehensive assessment of each tool, we compare the computational resources and speed of QIIME 2’s q2-feature-classifier, Kraken 2, and Bracken in generating the three main 16S rRNA databases: Greengenes, SILVA, and RDP. To assess accuracy, we evaluated each tool using the same simulated 16S rRNA reads from human gut, ocean, and soil metagenomes that were previously used to compare QIIME, MAPseq, mothur, and QIIME 2. We assessed accuracy using the final genus-level read counts assigned by each tool. Finally, as Kraken 2 is the only tool providing per-read taxonomic assignments, we evaluate the sensitivity and precision of Kraken 2’s per-read classifications. RESULTS: For both the Greengenes and SILVA databases, Kraken 2 and Bracken are up to 100 times faster at database generation. For classification, using the same data as previous studies, Kraken 2 and Bracken are up to 300 times faster, use 100 times less RAM, and generate results that are more accurate at 16S rRNA profiling than QIIME 2’s q2-feature-classifier. CONCLUSION: Kraken 2 and Bracken provide a very fast, efficient, and accurate solution for 16S rRNA metataxonomic data analysis. |
format | Online Article Text |
id | pubmed-7455996 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7455996 2020-08-31 Ultrafast and accurate 16S rRNA microbial community analysis using Kraken 2 Lu, Jennifer Salzberg, Steven L. Microbiome Research BioMed Central 2020-08-28 /pmc/articles/PMC7455996/ /pubmed/32859275 http://dx.doi.org/10.1186/s40168-020-00900-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Ultrafast and accurate 16S rRNA microbial community analysis using Kraken 2 |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7455996/ https://www.ncbi.nlm.nih.gov/pubmed/32859275 http://dx.doi.org/10.1186/s40168-020-00900-2 |