Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses
16S rRNA gene copy number (16S GCN) varies among bacterial species and this variation introduces potential biases to microbial diversity analyses using 16S rRNA read counts. To correct the biases, methods have been developed to predict 16S GCN. A recent study suggests that the prediction uncertainty...
Main authors: | Gao, Yingnan; Wu, Martin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257666/ https://www.ncbi.nlm.nih.gov/pubmed/37301942 http://dx.doi.org/10.1038/s43705-023-00266-0 |
_version_ | 1785057348653416448 |
---|---|
author | Gao, Yingnan Wu, Martin |
author_facet | Gao, Yingnan Wu, Martin |
author_sort | Gao, Yingnan |
collection | PubMed |
description | 16S rRNA gene copy number (16S GCN) varies among bacterial species and this variation introduces potential biases to microbial diversity analyses using 16S rRNA read counts. To correct the biases, methods have been developed to predict 16S GCN. A recent study suggests that the prediction uncertainty can be so great that copy number correction is not justified in practice. Here we develop RasperGade16S, a novel method and software to better model and capture the inherent uncertainty in 16S GCN prediction. RasperGade16S implements a maximum likelihood framework of pulsed evolution model and explicitly accounts for intraspecific GCN variation and heterogeneous GCN evolution rates among species. Using cross-validation, we show that our method provides robust confidence estimates for the GCN predictions and outperforms other methods in both precision and recall. We have predicted GCN for 592605 OTUs in the SILVA database and tested 113842 bacterial communities that represent an exhaustive and diverse list of engineered and natural environments. We found that the prediction uncertainty is small enough for 99% of the communities that 16S GCN correction should improve their compositional and functional profiles estimated using 16S rRNA reads. On the other hand, we found that GCN variation has limited impacts on beta-diversity analyses such as PCoA, NMDS, PERMANOVA and random-forest test. |
format | Online Article Text |
id | pubmed-10257666 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-102576662023-06-12 Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses Gao, Yingnan Wu, Martin ISME Commun Article 16S rRNA gene copy number (16S GCN) varies among bacterial species and this variation introduces potential biases to microbial diversity analyses using 16S rRNA read counts. To correct the biases, methods have been developed to predict 16S GCN. A recent study suggests that the prediction uncertainty can be so great that copy number correction is not justified in practice. Here we develop RasperGade16S, a novel method and software to better model and capture the inherent uncertainty in 16S GCN prediction. RasperGade16S implements a maximum likelihood framework of pulsed evolution model and explicitly accounts for intraspecific GCN variation and heterogeneous GCN evolution rates among species. Using cross-validation, we show that our method provides robust confidence estimates for the GCN predictions and outperforms other methods in both precision and recall. We have predicted GCN for 592605 OTUs in the SILVA database and tested 113842 bacterial communities that represent an exhaustive and diverse list of engineered and natural environments. We found that the prediction uncertainty is small enough for 99% of the communities that 16S GCN correction should improve their compositional and functional profiles estimated using 16S rRNA reads. On the other hand, we found that GCN variation has limited impacts on beta-diversity analyses such as PCoA, NMDS, PERMANOVA and random-forest test. 
Nature Publishing Group UK 2023-06-10 /pmc/articles/PMC10257666/ /pubmed/37301942 http://dx.doi.org/10.1038/s43705-023-00266-0 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Gao, Yingnan Wu, Martin Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses |
title | Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses |
title_full | Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses |
title_fullStr | Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses |
title_full_unstemmed | Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses |
title_short | Accounting for 16S rRNA copy number prediction uncertainty and its implications in bacterial diversity analyses |
title_sort | accounting for 16s rrna copy number prediction uncertainty and its implications in bacterial diversity analyses |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257666/ https://www.ncbi.nlm.nih.gov/pubmed/37301942 http://dx.doi.org/10.1038/s43705-023-00266-0 |
work_keys_str_mv | AT gaoyingnan accountingfor16srrnacopynumberpredictionuncertaintyanditsimplicationsinbacterialdiversityanalyses AT wumartin accountingfor16srrnacopynumberpredictionuncertaintyanditsimplicationsinbacterialdiversityanalyses |
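
The description above says that 16S GCN variation biases diversity estimates based on 16S rRNA read counts, and that copy number correction aims to remove that bias. As a minimal sketch of the general idea only, and not the RasperGade16S implementation, the Python below divides each taxon's read count by its predicted copy number and renormalizes to relative abundances. The function name `correct_for_gcn`, the taxon labels, and all numbers are hypothetical and chosen purely for illustration; the sketch also ignores the prediction uncertainty that the article is concerned with.

```python
def correct_for_gcn(read_counts, gcn):
    """Estimate relative cell abundances from 16S read counts.

    read_counts: dict mapping taxon -> number of 16S reads (hypothetical input)
    gcn:         dict mapping taxon -> predicted 16S gene copy number
    """
    adjusted = {}
    for taxon, reads in read_counts.items():
        copies = gcn[taxon]
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        # A genome with more 16S copies yields more reads per cell,
        # so dividing by the copy number approximates the cell count.
        adjusted[taxon] = reads / copies

    total = sum(adjusted.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    # Renormalize so the corrected abundances sum to 1.
    return {taxon: value / total for taxon, value in adjusted.items()}


# Illustrative values only; not taken from the article.
read_counts = {"taxon_A": 700, "taxon_B": 300}
gcn = {"taxon_A": 7, "taxon_B": 1}

corrected = correct_for_gcn(read_counts, gcn)
for taxon, fraction in corrected.items():
    print(taxon, round(fraction, 3))
```

In this toy case taxon_A accounts for 70% of the reads but only 25% of the estimated cells after correction, which shows how uncorrected read counts can overstate taxa with many 16S copies.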