How “simple” methodological decisions affect interpretation of population structure based on reduced representation library DNA sequencing: A case study using the lake whitefish
Reduced representation (RRL) sequencing approaches (e.g., RADSeq, genotyping by sequencing) require decisions about how much to invest in genome coverage and sequencing depth, as well as choices of values for adjustable bioinformatics parameters. To empirically explore the importance of these “simpl...
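Main Authors: | Graham, Carly F.; Boreham, Douglas R.; Manzon, Richard G.; Stott, Wendylee; Wilson, Joanna Y.; Somers, Christopher M. |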
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6980518/ https://www.ncbi.nlm.nih.gov/pubmed/31978053 http://dx.doi.org/10.1371/journal.pone.0226608 |
_version_ | 1783490956255821824 |
---|---|
author | Graham, Carly F.; Boreham, Douglas R.; Manzon, Richard G.; Stott, Wendylee; Wilson, Joanna Y.; Somers, Christopher M. |
author_sort | Graham, Carly F. |
collection | PubMed |
description | Reduced representation library (RRL) sequencing approaches (e.g., RADSeq, genotyping by sequencing) require decisions about how much to invest in genome coverage and sequencing depth, as well as choices of values for adjustable bioinformatics parameters. To empirically explore the importance of these “simple” methodological decisions, we generated two independent sequencing libraries for the same 142 individual lake whitefish (Coregonus clupeaformis) using a nextRAD RRL approach: (1) a larger number of loci at low sequencing depth based on a 9mer (library A); and (2) fewer loci at higher sequencing depth based on a 10mer (library B). The fish were selected from populations with different levels of expected genetic subdivision. Each library was analyzed using the STACKS pipeline followed by three types of population structure assessment (F(ST), DAPC and ADMIXTURE) with iterative increases in the stringency of sequencing depth and missing data requirements, as well as more specific a priori population maps. Library B was always able to resolve strong population differentiation in all three types of assessment regardless of the selected parameters, largely due to retention of more loci in analyses. In contrast, library A produced more variable results; increasing the minimum sequencing depth threshold (-m) resulted in a reduced number of retained loci, and therefore lost resolution at high -m values for F(ST) and ADMIXTURE, but not DAPC. When detecting fine population differentiation, the population map influenced the number of loci and missing data, which generated artefacts in all downstream analyses tested. Similarly, when examining fine scale population subdivision, library B was robust to changing parameters but library A lost resolution depending on the parameter set. We used library B to examine actual subdivision in our study populations. All three types of analysis found complete subdivision among populations in Lake Huron, ON and Dore Lake, SK, Canada using 10,640 SNP loci. Weak population subdivision was detected in Lake Huron with fish from sites in the north-west, Search Bay, North Point and Hammond Bay, showing slight differentiation. Overall, we show that apparently simple decisions about library construction and bioinformatics parameters can have important impacts on the interpretation of population subdivision. Although potentially more costly on a per-locus basis, early investment in striking a balance between the number of loci and sequencing effort is well worth the reduced genomic coverage for population genetics studies. More conservative stringency settings on STACKS parameters led to a final dataset that was more consistent and robust when examining both weak and strong population differentiation. Overall, we recommend that researchers approach “simple” methodological decisions with caution, especially when working on non-model species for the first time. |
format | Online Article Text |
id | pubmed-6980518 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6980518 2020-02-04 PLoS One Research Article Public Library of Science 2020-01-24 /pmc/articles/PMC6980518/ /pubmed/31978053 http://dx.doi.org/10.1371/journal.pone.0226608 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication. |
title | How “simple” methodological decisions affect interpretation of population structure based on reduced representation library DNA sequencing: A case study using the lake whitefish |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6980518/ https://www.ncbi.nlm.nih.gov/pubmed/31978053 http://dx.doi.org/10.1371/journal.pone.0226608 |
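
The description above reports that raising the minimum sequencing depth threshold (-m) reduced the number of retained loci once missing-data requirements were applied. The following is a minimal Python sketch of that interaction, not the STACKS implementation. The function name `retained_loci`, the `max_missing` cutoff, and every depth value in `toy` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Toy input: read depth per individual at each locus (None = no reads).
# All values are invented for illustration; none come from the study.
Depths = Dict[str, List[Optional[int]]]


def retained_loci(depths: Depths, min_depth: int, max_missing: float) -> List[str]:
    """Return the loci whose fraction of missing genotypes is <= max_missing.

    A genotype is treated as missing when the individual has no reads at
    the locus or fewer than `min_depth` reads (an assumed, simplified rule).
    """
    kept = []
    for locus, per_individual in depths.items():
        missing = sum(1 for d in per_individual if d is None or d < min_depth)
        if missing / len(per_individual) <= max_missing:
            kept.append(locus)
    return kept


toy: Depths = {
    "locus_1": [12, 15, 9, 11],
    "locus_2": [4, 5, 3, 6],
    "locus_3": [8, None, 7, 10],
    "locus_4": [2, 3, None, 2],
}

# Raise the depth threshold while holding the missing-data cutoff fixed.
for m in (3, 5, 8):
    print(m, retained_loci(toy, min_depth=m, max_missing=0.25))
```

In this toy data the low-depth loci drop out first as the threshold rises, which mirrors the loss of loci described for library A; a real analysis would rely on the filtering options of the pipeline itself.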