Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences
Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this...
Main authors: | Ziemski, Michal, Wisanwanichthan, Treepop, Bokulich, Nicholas A., Kaehler, Benjamin D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Microbiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8249850/ https://www.ncbi.nlm.nih.gov/pubmed/34220738 http://dx.doi.org/10.3389/fmicb.2021.644487 |
_version_ | 1783716986367246336 |
---|---|
author | Ziemski, Michal Wisanwanichthan, Treepop Bokulich, Nicholas A. Kaehler, Benjamin D. |
author_facet | Ziemski, Michal Wisanwanichthan, Treepop Bokulich, Nicholas A. Kaehler, Benjamin D. |
author_sort | Ziemski, Michal |
collection | PubMed |
description | Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios there is little scope for improving on NBC for taxonomic classification of 16S rRNA gene sequences. Further improvements in taxonomy classification are unlikely to come from novel algorithms alone, and will need to leverage other technological innovations, such as ecological frequency information. |
format | Online Article Text |
id | pubmed-8249850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8249850 2021-07-03 Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences Ziemski, Michal Wisanwanichthan, Treepop Bokulich, Nicholas A. Kaehler, Benjamin D. Front Microbiol Microbiology Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios there is little scope for improving on NBC for taxonomic classification of 16S rRNA gene sequences. Further improvements in taxonomy classification are unlikely to come from novel algorithms alone, and will need to leverage other technological innovations, such as ecological frequency information. Frontiers Media S.A. 2021-06-18 /pmc/articles/PMC8249850/ /pubmed/34220738 http://dx.doi.org/10.3389/fmicb.2021.644487 Text en Copyright © 2021 Ziemski, Wisanwanichthan, Bokulich and Kaehler. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Ziemski, Michal Wisanwanichthan, Treepop Bokulich, Nicholas A. Kaehler, Benjamin D. Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences |
title | Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences |
title_full | Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences |
title_fullStr | Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences |
title_full_unstemmed | Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences |
title_short | Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences |
title_sort | beating naive bayes at taxonomic classification of 16s rrna gene sequences |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8249850/ https://www.ncbi.nlm.nih.gov/pubmed/34220738 http://dx.doi.org/10.3389/fmicb.2021.644487 |
work_keys_str_mv | AT ziemskimichal beatingnaivebayesattaxonomicclassificationof16srrnagenesequences AT wisanwanichthantreepop beatingnaivebayesattaxonomicclassificationof16srrnagenesequences AT bokulichnicholasa beatingnaivebayesattaxonomicclassificationof16srrnagenesequences AT kaehlerbenjamind beatingnaivebayesattaxonomicclassificationof16srrnagenesequences |