Ecological Observations Based on Functional Gene Sequencing Are Sensitive to the Amplicon Processing Method

Bibliographic Details
Main authors: Cholet, Fabien; Lisik, Agata; Agogué, Hélène; Ijaz, Umer Z.; Pineau, Philippe; Lachaussée, Nicolas; Smith, Cindy J.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9429940/
https://www.ncbi.nlm.nih.gov/pubmed/35938727
http://dx.doi.org/10.1128/msphere.00324-22
Description
Summary: Until recently, the de facto method for short-read-based amplicon reconstruction was a sequence similarity threshold approach (operational taxonomic units [OTUs]). This has changed with the amplicon sequence variant (ASV) method where distributions are fitted to abundance profiles of individual genes using a noise-error model. While OTU-based approaches are still useful for 16S rRNA/18S rRNA genes, where thresholds of 97% to 99% are used, their use for functional genes is still debatable as there is no consensus on clustering thresholds. Here, we compare OTU- and ASV-based reconstruction approaches and taxonomy assignment methods, the naive Bayesian classifier (NBC) and Bayesian lowest common ancestor (BLCA) algorithm, using a functional gene data set from the microbial nitrogen-cycling community in the Brouage mudflat (France). A range of OTU similarity thresholds and ASVs were used to compare amoA (ammonia-oxidizing archaea [AOA] and ammonia-oxidizing bacteria [AOB]), nxrB, nirS, nirK, and nrfA communities between differing sedimentary structures. Significant effects of the sedimentary structure on weighted UniFrac (WUniFrac) distances were observed for AOA amoA when using ASVs, an OTU at a threshold of 97% sequence identity (OTU-97%), and OTU-85%; AOB amoA when using OTU-85%; and nirS when using ASV, OTU-90%, and OTU-85%. For AOB amoA, significant effects of the sedimentary structures on UniFrac distances were observed when using OTU-97% but not ASVs, and the inverse was found for nrfA. Interestingly, conclusions drawn for nirK and nxrB were consistent between amplicon reconstruction methods. We also show that when the sequences in the reference database are related to the environment in question, the BLCA algorithm leads to more phylogenetically relevant classifications. However, when the reference database contains sequences more dissimilar to the ones retrieved, the NBC obtains more information.
IMPORTANCE Several analysis pipelines are available to microbial ecologists to process amplicon sequencing data, yet to date, there is no consensus as to the most appropriate method, and it becomes more difficult for genes that encode a specific function (functional genes). Standardized approaches need to be adopted to increase the reliability and reproducibility of environmental amplicon-sequencing-based data sets. In this paper, we argue that the recently developed ASV approach offers a better opportunity to achieve such standardization than OTUs for functional genes. We also propose a comprehensive framework for quality filtering of the sequencing reads based on protein sequence verification.
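
Illustrative note: the summary's point that OTU-based results depend on the chosen similarity threshold can be seen in a minimal sketch. The code below is not the authors' pipeline. It assumes toy, made-up 20-nt reads that are already aligned to equal length, a simple per-position identity measure, and a greedy centroid clustering rule. The function names `identity` and `greedy_otus` are hypothetical and chosen only for this example.

```python
from __future__ import annotations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: list[str], threshold: float) -> list[list[str]]:
    """Greedy centroid clustering (a simplification of real OTU tools).

    Each read joins the first OTU whose centroid (its first member) it matches
    at >= threshold identity; otherwise it seeds a new OTU.
    """
    otus: list[list[str]] = []
    for s in seqs:
        for otu in otus:
            if identity(s, otu[0]) >= threshold:
                otu.append(s)
                break
        else:
            # No existing centroid was similar enough: start a new OTU.
            otus.append([s])
    return otus


# Toy reads (invented for illustration, not real functional gene data),
# listed in the order they would be processed.
reads = [
    "ACGTACGTACGTACGTACGT",  # first centroid
    "ACGTACGTACGTACGTACGA",  # 1 mismatch vs. first read (95% identity)
    "ACGTACGTACGTACGTATTA",  # 3 mismatches vs. first read (85% identity)
    "TGCAACGTACGTACGTACGT",  # 4 mismatches vs. first read (80% identity)
]

for threshold in (0.97, 0.90, 0.85):
    n_otus = len(greedy_otus(reads, threshold))
    print(f"threshold {threshold:.2f}: {n_otus} OTUs")
```

In this toy case the same four reads collapse into fewer OTUs as the threshold is lowered, so any downstream community comparison would start from a different set of units at each threshold. ASV methods avoid choosing such a threshold by modelling sequencing errors instead.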