
Rapid and accurate taxonomic classification of cpn60 amplicon sequence variants

The “universal target” region of the gene encoding the 60 kDa chaperonin protein (cpn60, also known as groEL or hsp60) is a proven sequence barcode for bacteria and a useful target for marker gene amplicon-based studies of complex microbial communities. To date, identification of cpn60 sequence vari...

Bibliographic Details
Main Authors: Ren, Qingyi; Hill, Janet E.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10362019/
https://www.ncbi.nlm.nih.gov/pubmed/37479852
http://dx.doi.org/10.1038/s43705-023-00283-z
_version_ 1785076329294594048
author Ren, Qingyi
Hill, Janet E.
collection PubMed
description The “universal target” region of the gene encoding the 60 kDa chaperonin protein (cpn60, also known as groEL or hsp60) is a proven sequence barcode for bacteria and a useful target for marker gene amplicon-based studies of complex microbial communities. To date, identification of cpn60 sequence variants from microbiome studies has been accomplished by alignment of queries to a reference database. Naïve Bayesian classifiers offer an alternative identification method that provides variable rank classification and shorter analysis times. We curated a set of cpn60 barcode sequences to train the RDP classifier and tested its performance on data from previous human microbiome studies. Results showed that sequences accounting for 79%, 86% and 92% of the observations (read counts) in saliva, vagina and infant stool microbiome data sets were classified to the species rank. We also trained the QIIME 2 q2-feature-classifier on cpn60 sequence data and demonstrated that it gives results consistent with the standalone RDP classifier. Successful implementation of a naïve Bayesian classifier for cpn60 sequences will facilitate future microbiome studies and open opportunities to integrate cpn60 amplicon sequence identification into existing analysis pipelines.
format Online
Article
Text
id pubmed-10362019
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10362019 2023-07-23 Rapid and accurate taxonomic classification of cpn60 amplicon sequence variants Ren, Qingyi Hill, Janet E. ISME Commun Article Nature Publishing Group UK 2023-07-21 /pmc/articles/PMC10362019/ /pubmed/37479852 http://dx.doi.org/10.1038/s43705-023-00283-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Rapid and accurate taxonomic classification of cpn60 amplicon sequence variants
topic Article