
Nitrogen Cycling Microbial Diversity and Operational Taxonomic Unit Clustering: When to Prioritize Accuracy Over Speed

BACKGROUND: Assessments of the soil microbiome provide valuable insight into ecosystem function due to the integral role microorganisms play in biogeochemical cycling of carbon and nutrients. For example, treatment effects on nitrogen cycling functional groups are often presented alongside one another...


Bibliographic Details
Main authors: Egenriether, Sada, Sanford, Robert, Yang, Wendy H., Kent, Angela D.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9201982/
https://www.ncbi.nlm.nih.gov/pubmed/35722279
http://dx.doi.org/10.3389/fmicb.2022.730340
_version_ 1784728435065618432
author Egenriether, Sada
Sanford, Robert
Yang, Wendy H.
Kent, Angela D.
collection PubMed
description BACKGROUND: Assessments of the soil microbiome provide valuable insight into ecosystem function due to the integral role microorganisms play in biogeochemical cycling of carbon and nutrients. For example, treatment effects on nitrogen cycling functional groups are often presented alongside one another to demonstrate how agricultural management practices affect various nitrogen cycling processes. However, the functional groups commonly evaluated in nitrogen cycling microbiome studies range from phylogenetically narrow (e.g., N-fixation, nitrification) to broad [e.g., denitrification, dissimilatory nitrate reduction to ammonium (DNRA)]. The bioinformatics methods used in such studies were developed for 16S rRNA gene sequence data, and how these tools perform across functional genes of different phylogenetic diversity has not been established. For example, an OTU clustering method that can accurately characterize sequences harboring comparatively little diversity may not accurately resolve the diversity within a gene comprising a large number of clades. This study uses two nitrogen cycling genes, nifH, a gene that segregates into only three distinct clades, and nrfA, a gene that comprises at least eighteen clades, to investigate differences that may arise when using heuristic OTU clustering (abundance-based greedy clustering, AGC) vs. true hierarchical OTU clustering (Matthews Correlation Coefficient optimizing algorithm, Opti-MCC). Detection of treatment differences for each gene was evaluated to demonstrate how conclusions drawn from a given dataset may differ depending on the clustering method used.
RESULTS: The heuristic and hierarchical methods performed comparably for the more conserved gene, nifH. The hierarchical method outperformed the heuristic method for the more diverse gene, nrfA; this included both the ability to detect treatment differences using PERMANOVA and higher resolution in taxonomic classification. The difference in performance between the two methods may be traced to the AGC method’s preferential assignment of sequences to the most abundant OTUs: when analysis was limited to only the largest 100 OTUs, results from the AGC-assembled OTU table more closely resembled those of the Opti-MCC OTU table. Additionally, both AGC and Opti-MCC OTU tables detected comparable treatment differences using the rank-based ANOSIM test. This demonstrates that treatment differences were preserved using both clustering methods but were structured differently within the OTU tables produced using each method.
CONCLUSION: For questions that can be answered using tests agnostic to clustering method (e.g., ANOSIM), or for genes of relatively low phylogenetic diversity (e.g., nifH), most upstream processing methods should lead to similar conclusions from downstream analyses. For studies involving more diverse genes, however, care should be exercised to choose methods that ensure accurate clustering for all genes. This will mitigate the risk of introducing Type II errors by allowing for detection of comparable treatment differences for all genes assessed, rather than disproportionately detecting treatment differences in only low-diversity genes.
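The contrast the abstract draws between abundance-based greedy clustering and MCC-optimizing clustering can be illustrated with a minimal Python sketch. It is not the implementation used in the study or in any clustering tool: the function names `greedy_abundance_clusters` and `clustering_mcc`, the four sequence labels, their abundances, the pairwise distances, and the 0.03 cutoff are all hypothetical values chosen for illustration. The sketch assumes that a greedy pass assigns each sequence to the most abundant existing centroid within the cutoff, and that clustering quality is scored by treating each pair of sequences as a true/false positive/negative according to whether the pair shares an OTU and whether its distance is within the cutoff.

```python
from itertools import combinations
from math import sqrt


def greedy_abundance_clusters(abundance, dist, cutoff):
    """Toy abundance-based greedy clustering; returns {sequence: centroid}."""
    centroids = []   # kept in descending order of abundance
    assignment = {}
    # Visit sequences from most to least abundant (ties broken by name).
    for seq in sorted(abundance, key=lambda s: (-abundance[s], s)):
        for c in centroids:
            # Join the first (most abundant) centroid within the cutoff.
            if dist[frozenset((seq, c))] <= cutoff:
                assignment[seq] = c
                break
        else:
            # No centroid is close enough: this sequence seeds a new OTU.
            centroids.append(seq)
            assignment[seq] = seq
    return assignment


def clustering_mcc(assignment, dist, cutoff):
    """Matthews correlation coefficient over all sequence pairs."""
    tp = fp = fn = tn = 0
    for a, b in combinations(sorted(assignment), 2):
        same_otu = assignment[a] == assignment[b]
        close = dist[frozenset((a, b))] <= cutoff
        if same_otu and close:
            tp += 1      # correctly grouped together
        elif same_otu:
            fp += 1      # grouped together but too distant
        elif close:
            fn += 1      # similar enough but split apart
        else:
            tn += 1      # correctly kept apart
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: B and C are each close to abundant A,
# but not close to each other; D is distant from everything.
abundance = {"A": 10, "B": 5, "C": 3, "D": 1}
dist = {
    frozenset(("A", "B")): 0.02,
    frozenset(("A", "C")): 0.02,
    frozenset(("B", "C")): 0.05,
    frozenset(("A", "D")): 0.10,
    frozenset(("B", "D")): 0.10,
    frozenset(("C", "D")): 0.10,
}
cutoff = 0.03

assignment = greedy_abundance_clusters(abundance, dist, cutoff)
mcc = clustering_mcc(assignment, dist, cutoff)
print(assignment)
print(round(mcc, 4))
```

In this toy case the greedy pass pulls both B and C into the OTU seeded by the abundant sequence A, even though B and C are farther apart than the cutoff, which produces one false-positive pair and an MCC below 1. An MCC-optimizing method would instead search for the assignment that maximizes this score.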
format Online
Article
Text
id pubmed-9201982
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9201982 2022-06-17 Nitrogen Cycling Microbial Diversity and Operational Taxonomic Unit Clustering: When to Prioritize Accuracy Over Speed Egenriether, Sada Sanford, Robert Yang, Wendy H. Kent, Angela D. Front Microbiol Microbiology Frontiers Media S.A. 2022-05-26 /pmc/articles/PMC9201982/ /pubmed/35722279 http://dx.doi.org/10.3389/fmicb.2022.730340 Text en Copyright © 2022 Egenriether, Sanford, Yang and Kent. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Nitrogen Cycling Microbial Diversity and Operational Taxonomic Unit Clustering: When to Prioritize Accuracy Over Speed
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9201982/
https://www.ncbi.nlm.nih.gov/pubmed/35722279
http://dx.doi.org/10.3389/fmicb.2022.730340