Accurate Allele Frequencies from Ultra-low Coverage Pool-Seq Samples in Evolve-and-Resequence Experiments

Evolve-and-resequence (E+R) experiments leverage next-generation sequencing technology to track the allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this p...

Bibliographic Details
Main authors: Tilk, Susanne, Bergland, Alan, Goodman, Aaron, Schmidt, Paul, Petrov, Dmitri, Greenblum, Sharon
Format: Online Article Text
Language: English
Published: Genetics Society of America 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6893198/
https://www.ncbi.nlm.nih.gov/pubmed/31636085
http://dx.doi.org/10.1534/g3.119.400755
author Tilk, Susanne
Bergland, Alan
Goodman, Aaron
Schmidt, Paul
Petrov, Dmitri
Greenblum, Sharon
collection PubMed
description Evolve-and-resequence (E+R) experiments leverage next-generation sequencing technology to track the allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this power comes at the expense of high-coverage (>100x) sequencing of many pooled samples, which can be cost-prohibitive. Here, we show that accurate estimates of allele frequencies can be achieved with very shallow sequencing depths (<5x) via inference of known founder haplotypes in small genomic windows. This technique can be used to efficiently estimate frequencies for any number of bi-allelic SNPs in populations of any model organism founded with sequenced homozygous strains. Using both experimentally-pooled and simulated samples of Drosophila melanogaster, we show that haplotype inference can improve allele frequency accuracy by orders of magnitude for up to 50 generations of recombination, and is robust to moderate levels of missing data, as well as different selection regimes. Finally, we show that a simple linear model generated from these simulations can predict the accuracy of haplotype-derived allele frequencies in other model organisms and experimental designs. To make these results broadly accessible for use in E+R experiments, we introduce HAF-pipe, an open-source software tool for calculating haplotype-derived allele frequencies from raw sequencing data. Ultimately, by reducing sequencing costs without sacrificing accuracy, our method facilitates E+R designs with higher replication and resolution, and thereby, increased power to detect adaptive alleles.
format Online
Article
Text
id pubmed-6893198
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-6893198 2019-12-05 G3 (Bethesda) Investigations
Genetics Society of America 2019-10-21 /pmc/articles/PMC6893198/ /pubmed/31636085 http://dx.doi.org/10.1534/g3.119.400755 Text en Copyright © 2019 Tilk et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Accurate Allele Frequencies from Ultra-low Coverage Pool-Seq Samples in Evolve-and-Resequence Experiments
topic Investigations