
Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding

DNA metabarcoding, the PCR-based profiling of natural communities, is becoming the method of choice for biodiversity monitoring because it circumvents some of the limitations inherent to traditional ecological surveys. However, potential sources of bias that can affect the reproducibility of this me...


Bibliographic Details
Main Authors: Leray, Matthieu; Knowlton, Nancy
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2017
Subjects: Ecology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5364921/
https://www.ncbi.nlm.nih.gov/pubmed/28348924
http://dx.doi.org/10.7717/peerj.3006
author Leray, Matthieu
Knowlton, Nancy
collection PubMed
description DNA metabarcoding, the PCR-based profiling of natural communities, is becoming the method of choice for biodiversity monitoring because it circumvents some of the limitations inherent to traditional ecological surveys. However, potential sources of bias that can affect the reproducibility of this method remain to be quantified. The interpretation of differences in patterns of sequence abundance and the ecological relevance of rare sequences remain particularly uncertain. Here we used one artificial mock community to explore the significance of abundance patterns and disentangle the effects of two potential biases on data reproducibility: indexed PCR primers and random sampling during Illumina MiSeq sequencing. We amplified a short fragment of the mitochondrial Cytochrome c Oxidase Subunit I (COI) for a single mock sample containing equimolar amounts of total genomic DNA from 34 marine invertebrates belonging to six phyla. We used seven indexed broad-range primers and sequenced the resulting library on two consecutive Illumina MiSeq runs. The total number of Operational Taxonomic Units (OTUs) was ∼4 times higher than expected based on the composition of the mock sample. Moreover, the total number of reads for the 34 components of the mock sample differed by up to three orders of magnitude. However, 79 of the 86 unexpected OTUs were represented by <10 sequences that did not appear consistently across replicates. Our data suggest that random sampling of rare OTUs (e.g., small associated fauna such as parasites) accounted for most of the variation in OTU presence–absence, whereas biases associated with indexed PCRs accounted for a larger amount of variation in relative abundance patterns. These results suggest that random sampling during sequencing leads to the low reproducibility of rare OTUs. We suggest that the strategy for handling rare OTUs should depend on the objectives of the study. Systematic removal of rare OTUs may avoid inflating diversity based on common β descriptors but will exclude positive records of taxa that are functionally important. Our results further reinforce the need for technical replicates (parallel PCR and sequencing from the same sample) in metabarcoding experimental designs. Data reproducibility should be determined empirically as it will depend upon the sequencing depth, the type of sample, the sequence analysis pipeline, and the number of replicates. Moreover, estimating relative biomasses or abundances based on read counts remains elusive at the OTU level.
format Online
Article
Text
id pubmed-5364921
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5364921 2017-03-27 Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding Leray, Matthieu Knowlton, Nancy PeerJ Ecology PeerJ Inc. 2017-03-22 /pmc/articles/PMC5364921/ /pubmed/28348924 http://dx.doi.org/10.7717/peerj.3006 Text en ©2017 Leray and Knowlton http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Ecology
Leray, Matthieu
Knowlton, Nancy
Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding
title Random sampling causes the low reproducibility of rare eukaryotic OTUs in Illumina COI metabarcoding
topic Ecology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5364921/
https://www.ncbi.nlm.nih.gov/pubmed/28348924
http://dx.doi.org/10.7717/peerj.3006