Signal and noise in metabarcoding data

Metabarcoding is a powerful molecular tool for simultaneously surveying hundreds to thousands of species from a single sample, underpinning microbiome and environmental DNA (eDNA) methods. Deriving quantitative estimates of underlying biological communities from metabarcoding is critical for enhanci...

Bibliographic Details
Main authors: Gold, Zachary; Shelton, Andrew Olaf; Casendino, Helen R.; Duprey, Joe; Gallego, Ramón; Van Cise, Amy; Fisher, Mary; Jensen, Alexander J.; D’Agnese, Erin; Andruszkiewicz Allan, Elizabeth; Ramón-Laca, Ana; Garber-Yonts, Maya; Labare, Michaela; Parsons, Kim M.; Kelly, Ryan P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10174484/
https://www.ncbi.nlm.nih.gov/pubmed/37167310
http://dx.doi.org/10.1371/journal.pone.0285674
author Gold, Zachary
Shelton, Andrew Olaf
Casendino, Helen R.
Duprey, Joe
Gallego, Ramón
Van Cise, Amy
Fisher, Mary
Jensen, Alexander J.
D’Agnese, Erin
Andruszkiewicz Allan, Elizabeth
Ramón-Laca, Ana
Garber-Yonts, Maya
Labare, Michaela
Parsons, Kim M.
Kelly, Ryan P.
author_sort Gold, Zachary
collection PubMed
description Metabarcoding is a powerful molecular tool for simultaneously surveying hundreds to thousands of species from a single sample, underpinning microbiome and environmental DNA (eDNA) methods. Deriving quantitative estimates of underlying biological communities from metabarcoding is critical for enhancing the utility of such approaches for health and conservation. Recent work has demonstrated that correcting for amplification biases in genetic metabarcoding data can yield quantitative estimates of template DNA concentrations. However, a major source of uncertainty in metabarcoding data stems from non-detections across technical PCR replicates where one replicate fails to detect a species observed in other replicates. Such non-detections are a special case of variability among technical replicates in metabarcoding data. While many sampling and amplification processes underlie observed variation in metabarcoding data, understanding the causes of non-detections is an important step in distinguishing signal from noise in metabarcoding studies. Here, we use both simulated and empirical data to 1) suggest how non-detections may arise in metabarcoding data, 2) outline steps to recognize uninformative data in practice, and 3) identify the conditions under which amplicon sequence data can reliably detect underlying biological signals. We show with both simulations and empirical data that, for a given species, the rate of non-detections among technical replicates is a function of both the template DNA concentration and species-specific amplification efficiency. Consequently, we conclude metabarcoding datasets are strongly affected by (1) deterministic amplification biases during PCR and (2) stochastic sampling of amplicons during sequencing—both of which we can model—but also by (3) stochastic sampling of rare molecules prior to PCR, which remains a frontier for quantitative metabarcoding. 
Our results highlight the importance of estimating species-specific amplification efficiencies and critically evaluating patterns of non-detection in metabarcoding datasets to better distinguish environmental signal from the noise inherent in molecular detections of rare targets.
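The three processes named in the description can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' model: it assumes Poisson sampling of template molecules before PCR, a deterministic gain of (1 + efficiency)^cycles per molecule, and sequencing reads drawn in proportion to amplicon abundance. The function names, the cycle count, the read depth, and the example concentrations and efficiencies are all hypothetical.

```python
import math
import random


def draw_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's method (adequate for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_replicate(mean_copies, efficiencies, cycles, reads, rng):
    """Simulate read counts per species for one technical PCR replicate."""
    n = len(mean_copies)
    # Stochastic sampling of rare template molecules prior to PCR (assumed Poisson).
    templates = [draw_poisson(lam, rng) for lam in mean_copies]
    # Deterministic amplification bias: each molecule grows by (1 + a) per cycle.
    amplicons = [t * (1.0 + a) ** cycles for t, a in zip(templates, efficiencies)]
    if sum(amplicons) == 0:
        return [0] * n  # nothing amplified, so nothing can be sequenced
    # Stochastic sampling of amplicons during sequencing (proportional draw).
    counts = [0] * n
    for i in rng.choices(range(n), weights=amplicons, k=reads):
        counts[i] += 1
    return counts


def non_detection_rates(mean_copies, efficiencies, cycles=35, reads=2000,
                        replicates=200, seed=1):
    """Fraction of replicates in which each species receives zero reads."""
    rng = random.Random(seed)
    misses = [0] * len(mean_copies)
    for _ in range(replicates):
        counts = simulate_replicate(mean_copies, efficiencies, cycles, reads, rng)
        for i, c in enumerate(counts):
            if c == 0:
                misses[i] += 1
    return [m / replicates for m in misses]


if __name__ == "__main__":
    # Hypothetical species: abundant and efficient, abundant but inefficient, rare but efficient.
    print(non_detection_rates([20, 20, 0.5], [0.9, 0.5, 0.9]))
```

In this toy setup a species can go undetected either because it is rare in the template pool or because it amplifies poorly relative to its neighbours, which mirrors the description's point that non-detection depends on both concentration and amplification efficiency.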
format Online
Article
Text
id pubmed-10174484
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10174484 2023-05-12
PLoS One, Research Article
Public Library of Science 2023-05-11
/pmc/articles/PMC10174484/ /pubmed/37167310 http://dx.doi.org/10.1371/journal.pone.0285674
Text en
https://creativecommons.org/publicdomain/zero/1.0/
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title Signal and noise in metabarcoding data
topic Research Article