
Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses

16S rRNA gene copy number (16S GCN) varies among bacterial species and this variation introduces potential biases to microbial diversity analyses using 16S rRNA read counts. To correct the biases, methods have been developed to predict 16S GCN. A recent study suggests that the prediction uncertainty...


Bibliographic Details
Main authors: Gao, Yingnan; Wu, Martin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257666/
https://www.ncbi.nlm.nih.gov/pubmed/37301942
http://dx.doi.org/10.1038/s43705-023-00266-0
_version_ 1785057348653416448
author Gao, Yingnan
Wu, Martin
author_facet Gao, Yingnan
Wu, Martin
author_sort Gao, Yingnan
collection PubMed
description 16S rRNA gene copy number (16S GCN) varies among bacterial species and this variation introduces potential biases to microbial diversity analyses using 16S rRNA read counts. To correct the biases, methods have been developed to predict 16S GCN. A recent study suggests that the prediction uncertainty can be so great that copy number correction is not justified in practice. Here we develop RasperGade16S, a novel method and software to better model and capture the inherent uncertainty in 16S GCN prediction. RasperGade16S implements a maximum likelihood framework of pulsed evolution model and explicitly accounts for intraspecific GCN variation and heterogeneous GCN evolution rates among species. Using cross-validation, we show that our method provides robust confidence estimates for the GCN predictions and outperforms other methods in both precision and recall. We have predicted GCN for 592605 OTUs in the SILVA database and tested 113842 bacterial communities that represent an exhaustive and diverse list of engineered and natural environments. We found that the prediction uncertainty is small enough for 99% of the communities that 16S GCN correction should improve their compositional and functional profiles estimated using 16S rRNA reads. On the other hand, we found that GCN variation has limited impacts on beta-diversity analyses such as PCoA, NMDS, PERMANOVA and random-forest test.
format Online
Article
Text
id pubmed-10257666
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10257666 2023-06-12 Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses Gao, Yingnan Wu, Martin ISME Commun Article
Nature Publishing Group UK 2023-06-10 /pmc/articles/PMC10257666/ /pubmed/37301942 http://dx.doi.org/10.1038/s43705-023-00266-0 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257666/
https://www.ncbi.nlm.nih.gov/pubmed/37301942
http://dx.doi.org/10.1038/s43705-023-00266-0