Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding
DNA metabarcoding, the PCR-based profiling of natural communities, is becoming the method of choice for biodiversity monitoring because it circumvents some of the limitations inherent to traditional ecological surveys. However, potential sources of bias that can affect the reproducibility of this me...
Main authors: | Leray, Matthieu; Knowlton, Nancy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2017 |
Subjects: | Ecology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5364921/ https://www.ncbi.nlm.nih.gov/pubmed/28348924 http://dx.doi.org/10.7717/peerj.3006 |
_version_ | 1782517422143569920 |
---|---|
author | Leray, Matthieu Knowlton, Nancy |
author_facet | Leray, Matthieu Knowlton, Nancy |
author_sort | Leray, Matthieu |
collection | PubMed |
description | DNA metabarcoding, the PCR-based profiling of natural communities, is becoming the method of choice for biodiversity monitoring because it circumvents some of the limitations inherent to traditional ecological surveys. However, potential sources of bias that can affect the reproducibility of this method remain to be quantified. The interpretation of differences in patterns of sequence abundance and the ecological relevance of rare sequences remain particularly uncertain. Here we used one artificial mock community to explore the significance of abundance patterns and disentangle the effects of two potential biases on data reproducibility: indexed PCR primers and random sampling during Illumina MiSeq sequencing. We amplified a short fragment of the mitochondrial Cytochrome c Oxidase Subunit I (COI) for a single mock sample containing equimolar amounts of total genomic DNA from 34 marine invertebrates belonging to six phyla. We used seven indexed broad-range primers and sequenced the resulting library on two consecutive Illumina MiSeq runs. The total number of Operational Taxonomic Units (OTUs) was ∼4 times higher than expected based on the composition of the mock sample. Moreover, the total number of reads for the 34 components of the mock sample differed by up to three orders of magnitude. However, 79 out of 86 of the unexpected OTUs were represented by <10 sequences that did not appear consistently across replicates. Our data suggest that random sampling of rare OTUs (e.g., small associated fauna such as parasites) accounted for most of variation in OTU presence–absence, whereas biases associated with indexed PCRs accounted for a larger amount of variation in relative abundance patterns. These results suggest that random sampling during sequencing leads to the low reproducibility of rare OTUs. We suggest that the strategy for handling rare OTUs should depend on the objectives of the study. Systematic removal of rare OTUs may avoid inflating diversity based on common β descriptors but will exclude positive records of taxa that are functionally important. Our results further reinforce the need for technical replicates (parallel PCR and sequencing from the same sample) in metabarcoding experimental designs. Data reproducibility should be determined empirically as it will depend upon the sequencing depth, the type of sample, the sequence analysis pipeline, and the number of replicates. Moreover, estimating relative biomasses or abundances based on read counts remains elusive at the OTU level. |
format | Online Article Text |
id | pubmed-5364921 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5364921 2017-03-27 Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding Leray, Matthieu Knowlton, Nancy PeerJ Ecology DNA metabarcoding, the PCR-based profiling of natural communities, is becoming the method of choice for biodiversity monitoring because it circumvents some of the limitations inherent to traditional ecological surveys. However, potential sources of bias that can affect the reproducibility of this method remain to be quantified. The interpretation of differences in patterns of sequence abundance and the ecological relevance of rare sequences remain particularly uncertain. Here we used one artificial mock community to explore the significance of abundance patterns and disentangle the effects of two potential biases on data reproducibility: indexed PCR primers and random sampling during Illumina MiSeq sequencing. We amplified a short fragment of the mitochondrial Cytochrome c Oxidase Subunit I (COI) for a single mock sample containing equimolar amounts of total genomic DNA from 34 marine invertebrates belonging to six phyla. We used seven indexed broad-range primers and sequenced the resulting library on two consecutive Illumina MiSeq runs. The total number of Operational Taxonomic Units (OTUs) was ∼4 times higher than expected based on the composition of the mock sample. Moreover, the total number of reads for the 34 components of the mock sample differed by up to three orders of magnitude. However, 79 out of 86 of the unexpected OTUs were represented by <10 sequences that did not appear consistently across replicates. Our data suggest that random sampling of rare OTUs (e.g., small associated fauna such as parasites) accounted for most of variation in OTU presence–absence, whereas biases associated with indexed PCRs accounted for a larger amount of variation in relative abundance patterns. These results suggest that random sampling during sequencing leads to the low reproducibility of rare OTUs. We suggest that the strategy for handling rare OTUs should depend on the objectives of the study. Systematic removal of rare OTUs may avoid inflating diversity based on common β descriptors but will exclude positive records of taxa that are functionally important. Our results further reinforce the need for technical replicates (parallel PCR and sequencing from the same sample) in metabarcoding experimental designs. Data reproducibility should be determined empirically as it will depend upon the sequencing depth, the type of sample, the sequence analysis pipeline, and the number of replicates. Moreover, estimating relative biomasses or abundances based on read counts remains elusive at the OTU level. PeerJ Inc. 2017-03-22 /pmc/articles/PMC5364921/ /pubmed/28348924 http://dx.doi.org/10.7717/peerj.3006 Text en ©2017 Leray and Knowlton http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Ecology Leray, Matthieu Knowlton, Nancy Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding |
title | Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding |
title_full | Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding |
title_fullStr | Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding |
title_full_unstemmed | Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding |
title_short | Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding |
title_sort | random sampling causes the low reproducibility of rare eukaryotic otus in illumina coi metabarcoding |
topic | Ecology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5364921/ https://www.ncbi.nlm.nih.gov/pubmed/28348924 http://dx.doi.org/10.7717/peerj.3006 |
work_keys_str_mv | AT leraymatthieu randomsamplingcausesthelowreproducibilityofrareeukaryoticotusinilluminacoimetabarcoding AT knowltonnancy randomsamplingcausesthelowreproducibilityofrareeukaryoticotusinilluminacoimetabarcoding |
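
The abstract's central claim, that random sampling during sequencing makes rare OTUs poorly reproducible, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' analysis: it draws two replicate sets of reads from the same pool of templates and compares how many OTUs appear in both. Only the OTU counts (34 mock-community taxa, 86 unexpected OTUs) come from the record above; the relative abundances, the read depth, and all function and variable names are hypothetical values chosen for illustration.

```python
import random

# Only the two OTU counts come from the record; everything else is an
# illustrative assumption.
N_ABUNDANT = 34           # mock-community taxa reported in the abstract
N_RARE = 86               # unexpected OTUs reported in the abstract
ABUNDANT_WEIGHT = 1000.0  # hypothetical relative template abundance
RARE_WEIGHT = 0.5         # hypothetical; gives rare OTUs < 1 expected read
DEPTH = 20_000            # hypothetical number of reads per replicate


def sequence_replicate(weights, depth, rng):
    """Draw `depth` reads with replacement, proportional to `weights`."""
    otus = list(weights)
    reads = rng.choices(otus, weights=[weights[o] for o in otus], k=depth)
    counts = {}
    for otu in reads:
        counts[otu] = counts.get(otu, 0) + 1
    return counts


def shared_fraction(rep_a, rep_b, otus):
    """Of the OTUs seen in at least one replicate, the fraction seen in both."""
    either = [o for o in otus if o in rep_a or o in rep_b]
    if not either:
        return 0.0
    both = [o for o in either if o in rep_a and o in rep_b]
    return len(both) / len(either)


abundant = [f"mock_{i}" for i in range(N_ABUNDANT)]
rare = [f"rare_{i}" for i in range(N_RARE)]
weights = {o: ABUNDANT_WEIGHT for o in abundant}
weights.update({o: RARE_WEIGHT for o in rare})

rng = random.Random(1)  # fixed seed so the sketch is repeatable
rep_a = sequence_replicate(weights, DEPTH, rng)
rep_b = sequence_replicate(weights, DEPTH, rng)

print("abundant OTUs shared:", shared_fraction(rep_a, rep_b, abundant))
print("rare OTUs shared:", shared_fraction(rep_a, rep_b, rare))
```

Both replicates sample the same template pool, so any disagreement in which rare OTUs are detected arises from sampling alone; no PCR or primer bias is modelled. Under these assumptions, raising `DEPTH` or `RARE_WEIGHT` should increase the fraction of rare OTUs shared between replicates.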