Estimating intraspecific genetic diversity from community DNA metabarcoding data
BACKGROUND: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspeci...
Main authors: | Elbrecht, Vasco; Vamos, Ecaterina Edith; Steinke, Dirk; Leese, Florian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2018 |
Subjects: | Biogeography |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896493/ https://www.ncbi.nlm.nih.gov/pubmed/29666773 http://dx.doi.org/10.7717/peerj.4644 |
_version_ | 1783313844538441728 |
---|---|
author | Elbrecht, Vasco; Vamos, Ecaterina Edith; Steinke, Dirk; Leese, Florian |
author_facet | Elbrecht, Vasco; Vamos, Ecaterina Edith; Steinke, Dirk; Leese, Florian |
author_sort | Elbrecht, Vasco |
collection | PubMed |
description | BACKGROUND: DNA metabarcoding is used to generate species composition data for entire communities. However, sequencing errors in high-throughput sequencing instruments are fairly common, usually requiring reads to be clustered into operational taxonomic units (OTUs), losing information on intraspecific diversity in the process. While Cytochrome c oxidase subunit I (COI) haplotype information is limited in resolving intraspecific diversity, it is nevertheless often useful, e.g. in a phylogeographic context, helping to formulate hypotheses on taxon distribution and dispersal. METHODS: This study combines sequence denoising strategies, normally applied in microbial research, with additional abundance-based filtering to extract haplotype information from freshwater macroinvertebrate metabarcoding datasets. This novel approach was added to the R package “JAMP” and can be applied to COI amplicon datasets. We tested our haplotyping method by sequencing (i) a single-species mock community composed of 31 individuals with 15 different haplotypes spanning three orders of magnitude in biomass and (ii) 18 monitoring samples, each amplified with four different primer sets and two PCR replicates. RESULTS: We detected all 15 haplotypes of the single specimens in the mock community with relaxed filtering and denoising settings. However, up to 480 additional unexpected haplotypes remained in both replicates. Rigorous filtering removes most unexpected haplotypes, but can also discard expected haplotypes, mainly from the small specimens. In the monitoring samples, the different primer sets detected 177–200 OTUs, each containing an average of 2.40–3.30 haplotypes per OTU. The derived intraspecific diversity data showed population structures that were consistent between replicates and similar between primer pairs, but resolution depended on the primer length. A closer look at abundant taxa in the dataset revealed various population genetic patterns, e.g. the stonefly Taeniopteryx nebulosa and the caddisfly Hydropsyche pellucidula showed a distinct north–south cline with respect to haplotype distribution, while the beetle Oulimnius tuberculatus and the isopod Asellus aquaticus displayed no clear population pattern but differed in genetic diversity. DISCUSSION: We developed a strategy to infer intraspecific genetic diversity from bulk invertebrate metabarcoding data. It needs to be stressed that at this point this metabarcoding-informed haplotyping is not capable of capturing the full diversity present in such samples, due to variation in specimen size, primer bias and loss of sequence variants with low abundance. Nevertheless, for a high number of species intraspecific diversity was recovered, identifying potentially isolated populations and taxa for further, more detailed phylogeographic investigation. While we are currently lacking large-scale metabarcoding datasets to fully take advantage of our new approach, metabarcoding-informed haplotyping holds great promise for biomonitoring efforts that not only seek information about species diversity but also underlying genetic diversity. |
format | Online Article Text |
id | pubmed-5896493 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5896493 2018-04-17 Estimating intraspecific genetic diversity from community DNA metabarcoding data Elbrecht, Vasco; Vamos, Ecaterina Edith; Steinke, Dirk; Leese, Florian PeerJ Biogeography PeerJ Inc. 2018-04-09 /pmc/articles/PMC5896493/ /pubmed/29666773 http://dx.doi.org/10.7717/peerj.4644 Text en © 2018 Elbrecht et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Biogeography Elbrecht, Vasco Vamos, Ecaterina Edith Steinke, Dirk Leese, Florian Estimating intraspecific genetic diversity from community DNA metabarcoding data |
title | Estimating intraspecific genetic diversity from community DNA metabarcoding data |
title_full | Estimating intraspecific genetic diversity from community DNA metabarcoding data |
title_fullStr | Estimating intraspecific genetic diversity from community DNA metabarcoding data |
title_full_unstemmed | Estimating intraspecific genetic diversity from community DNA metabarcoding data |
title_short | Estimating intraspecific genetic diversity from community DNA metabarcoding data |
title_sort | estimating intraspecific genetic diversity from community dna metabarcoding data |
topic | Biogeography |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5896493/ https://www.ncbi.nlm.nih.gov/pubmed/29666773 http://dx.doi.org/10.7717/peerj.4644 |
work_keys_str_mv | AT elbrechtvasco estimatingintraspecificgeneticdiversityfromcommunitydnametabarcodingdata AT vamosecaterinaedith estimatingintraspecificgeneticdiversityfromcommunitydnametabarcodingdata AT steinkedirk estimatingintraspecificgeneticdiversityfromcommunitydnametabarcodingdata AT leeseflorian estimatingintraspecificgeneticdiversityfromcommunitydnametabarcodingdata |