Estimating intraspecific genetic diversity from community DNA metabarcoding data

BACKGROUND: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspeci...

Bibliographic Details
Main authors: Elbrecht, Vasco; Vamos, Ecaterina Edith; Steinke, Dirk; Leese, Florian
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896493/
https://www.ncbi.nlm.nih.gov/pubmed/29666773
http://dx.doi.org/10.7717/peerj.4644
_version_ 1783313844538441728
author Elbrecht, Vasco
Vamos, Ecaterina Edith
Steinke, Dirk
Leese, Florian
author_facet Elbrecht, Vasco
Vamos, Ecaterina Edith
Steinke, Dirk
Leese, Florian
author_sort Elbrecht, Vasco
collection PubMed
description BACKGROUND: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While Cytochrome c oxidase subunit I (COI) haplotype information is limited in resolving intraspecific diversity it is nevertheless often useful e.g. in a phylogeographic context, helping to formulate hypotheses on taxon distribution and dispersal. METHODS: This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotype information from freshwater macroinvertebrate metabarcoding datasets. This novel approach was added to the R package “JAMP” and can be applied to COI amplicon datasets. We tested our haplotyping method by sequencing (i) a single-species mock community composed of 31 individuals with 15 different haplotypes spanning three orders of magnitude in biomass and (ii) 18 monitoring samples each amplified with four different primer sets and two PCR replicates. RESULTS: We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but also can discard expected haplotypes mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40–3.30 haplotypes per OTU. The derived intraspecific diversity data showed population structures that were consistent between replicates and similar between primer pairs but resolution depended on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. the stonefly Taeniopteryx nebulosa and the caddisfly Hydropsyche pellucidula showed a distinct north–south cline with respect to haplotype distribution, while the beetle Oulimnius tuberculatus and the isopod Asellus aquaticus displayed no clear population pattern but differed in genetic diversity. DISCUSSION: We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate metabarcoding data. It needs to be stressed that at this point this metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and taxa for further more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding datasets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about species diversity but also underlying genetic diversity.
format Online
Article
Text
id pubmed-5896493
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-58964932018-04-17 Estimating intraspecific genetic diversity from community DNA metabarcoding data Elbrecht, Vasco Vamos, Ecaterina Edith Steinke, Dirk Leese, Florian PeerJ Biogeography BACKGROUND: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While Cytochrome c oxidase subunit I (COI) haplotype information is limited in resolving intraspecific diversity it is nevertheless often useful e.g. in a phylogeographic context, helping to formulate hypotheses on taxon distribution and dispersal. METHODS: This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotype information from freshwater macroinvertebrate metabarcoding datasets. This novel approach was added to the R package “JAMP” and can be applied to COI amplicon datasets. We tested our haplotyping method by sequencing (i) a single-species mock community composed of 31 individuals with 15 different haplotypes spanning three orders of magnitude in biomass and (ii) 18 monitoring samples each amplified with four different primer sets and two PCR replicates. RESULTS: We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but also can discard expected haplotypes mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40–3.30 haplotypes per OTU. 
The derived intraspecific diversity data showed population structures that were consistent between replicates and similar between primer pairs but resolution depended on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. the stonefly Taeniopteryx nebulosa and the caddisfly Hydropsyche pellucidula showed a distinct north–south cline with respect to haplotype distribution, while the beetle Oulimnius tuberculatus and the isopod Asellus aquaticus displayed no clear population pattern but differed in genetic diversity. DISCUSSION: We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate metabarcoding data. It needs to be stressed that at this point this metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and taxa for further more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding datasets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about species diversity but also underlying genetic diversity. PeerJ Inc. 2018-04-09 /pmc/articles/PMC5896493/ /pubmed/29666773 http://dx.doi.org/10.7717/peerj.4644 Text en © 2018 Elbrecht et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. 
For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Biogeography
Elbrecht, Vasco
Vamos, Ecaterina Edith
Steinke, Dirk
Leese, Florian
Estimating intraspecific genetic diversity from community DNA metabarcoding data
title Estimating intraspecific genetic diversity from community DNA metabarcoding data
title_sort estimating intraspecific genetic diversity from community dna metabarcoding data
topic Biogeography
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896493/
https://www.ncbi.nlm.nih.gov/pubmed/29666773
http://dx.doi.org/10.7717/peerj.4644