How “simple” methodological decisions affect interpretation of population structure based on reduced representation library DNA sequencing: A case study using the lake whitefish

Bibliographic Details
Main authors: Graham, Carly F., Boreham, Douglas R., Manzon, Richard G., Stott, Wendylee, Wilson, Joanna Y., Somers, Christopher M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6980518/
https://www.ncbi.nlm.nih.gov/pubmed/31978053
http://dx.doi.org/10.1371/journal.pone.0226608
author Graham, Carly F.
Boreham, Douglas R.
Manzon, Richard G.
Stott, Wendylee
Wilson, Joanna Y.
Somers, Christopher M.
author_sort Graham, Carly F.
collection PubMed
description Reduced representation (RRL) sequencing approaches (e.g., RADSeq, genotyping by sequencing) require decisions about how much to invest in genome coverage and sequencing depth, as well as choices of values for adjustable bioinformatics parameters. To empirically explore the importance of these “simple” methodological decisions, we generated two independent sequencing libraries for the same 142 individual lake whitefish (Coregonus clupeaformis) using a nextRAD RRL approach: (1) a larger number of loci at low sequencing depth based on a 9mer (library A); and (2) fewer loci at higher sequencing depth based on a 10mer (library B). The fish were selected from populations with different levels of expected genetic subdivision. Each library was analyzed using the STACKS pipeline followed by three types of population structure assessment (F(ST), DAPC and ADMIXTURE) with iterative increases in the stringency of sequencing depth and missing data requirements, as well as more specific a priori population maps. Library B was always able to resolve strong population differentiation in all three types of assessment regardless of the selected parameters, largely due to retention of more loci in analyses. In contrast, library A produced more variable results; increasing the minimum sequencing depth threshold (-m) resulted in a reduced number of retained loci, and therefore lost resolution at high -m values for F(ST) and ADMIXTURE, but not DAPC. When detecting fine population differentiation, the population map influenced the number of loci and missing data, which generated artefacts in all downstream analyses tested. Similarly, when examining fine scale population subdivision, library B was robust to changing parameters but library A lost resolution depending on the parameter set. We used library B to examine actual subdivision in our study populations. 
All three types of analysis found complete subdivision among populations in Lake Huron, ON and Dore Lake, SK, Canada using 10,640 SNP loci. Weak population subdivision was detected within Lake Huron, with fish from the north-western sites (Search Bay, North Point and Hammond Bay) showing slight differentiation. Overall, we show that apparently simple decisions about library construction and bioinformatics parameters can have important impacts on the interpretation of population subdivision. Although potentially more costly on a per-locus basis, early investment in striking a balance between the number of loci and sequencing effort is well worth the reduced genomic coverage for population genetics studies. More conservative stringency settings on STACKS parameters led to a final dataset that was more consistent and robust when examining both weak and strong population differentiation. Overall, we recommend that researchers approach “simple” methodological decisions with caution, especially when working on non-model species for the first time.
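The trade-off described in the abstract above (stricter depth and missing-data requirements leave fewer loci for downstream analysis) can be illustrated with a toy sketch. This is not the STACKS implementation and does not use the study's data: the depth model, the function names `retained_loci` and `simulate_depths`, and all numbers are assumptions made purely for illustration, and `min_depth` is only loosely analogous to the -m threshold mentioned in the abstract.

```python
import random


def retained_loci(depths, min_depth, max_missing):
    """Count loci that survive a depth filter followed by a missing-data filter.

    depths: list of loci, each a list of read depths (one per individual).
    min_depth: a genotype with fewer reads than this is treated as missing
               (hypothetical stand-in for a minimum-depth threshold).
    max_missing: largest fraction of missing individuals tolerated per locus.
    """
    kept = 0
    for locus in depths:
        missing = sum(1 for d in locus if d < min_depth)
        if missing / len(locus) <= max_missing:
            kept += 1
    return kept


def simulate_depths(n_loci, n_individuals, mean_depth, rng):
    """Toy depth matrix: each depth is uniform on 0..2*mean_depth (an assumption)."""
    return [
        [rng.randint(0, 2 * mean_depth) for _ in range(n_individuals)]
        for _ in range(n_loci)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical libraries: many loci at low depth vs. fewer loci at high depth.
    library_a = simulate_depths(2000, 20, 4, rng)
    library_b = simulate_depths(1000, 20, 20, rng)
    for min_depth in (3, 5, 10):
        a = retained_loci(library_a, min_depth, max_missing=0.2)
        b = retained_loci(library_b, min_depth, max_missing=0.2)
        print(f"min_depth={min_depth}: library A keeps {a}, library B keeps {b}")
```

Under these assumptions, the count of retained loci can only stay the same or fall as `min_depth` rises, which is the mechanism the abstract gives for library A losing resolution at high -m values.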
format Online
Article
Text
id pubmed-6980518
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6980518 2020-02-04 PLoS One Research Article Public Library of Science 2020-01-24 /pmc/articles/PMC6980518/ /pubmed/31978053 http://dx.doi.org/10.1371/journal.pone.0226608 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title How “simple” methodological decisions affect interpretation of population structure based on reduced representation library DNA sequencing: A case study using the lake whitefish
topic Research Article