An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data

Most current methods for detecting natural selection from DNA sequence data are limited in that they are either based on summary statistics or a composite likelihood, and as a consequence, do not make full use of the information available in DNA sequence data. We here present a new importance sampli...

Bibliographic Details
Main authors: Stern, Aaron J., Wilton, Peter R., Nielsen, Rasmus
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6760815/
https://www.ncbi.nlm.nih.gov/pubmed/31518343
http://dx.doi.org/10.1371/journal.pgen.1008384
author Stern, Aaron J.
Wilton, Peter R.
Nielsen, Rasmus
collection PubMed
description Most current methods for detecting natural selection from DNA sequence data are limited in that they are either based on summary statistics or a composite likelihood, and as a consequence, do not make full use of the information available in DNA sequence data. We here present a new importance sampling approach for approximating the full likelihood function for the selection coefficient. Our method CLUES treats the ancestral recombination graph (ARG) as a latent variable that is integrated out using previously published Markov Chain Monte Carlo (MCMC) methods. The method can be used for detecting selection, estimating selection coefficients, testing models of changes in the strength of selection, estimating the time of the start of a selective sweep, and for inferring the allele frequency trajectory of a selected or neutral allele. We perform extensive simulations to evaluate the method and show that it uniformly improves power to detect selection compared to current popular methods such as nSL and SDS, and can provide reliable inferences of allele frequency trajectories under many conditions. We also explore the potential of our method to detect extremely recent changes in the strength of selection. We use the method to infer the past allele frequency trajectory for a lactase persistence SNP (MCM6) in Europeans. We also infer the trajectory of a SNP (EDAR) in Han Chinese, finding evidence that this allele’s age is much older than previously claimed. We also study a set of 11 pigmentation-associated variants. Several genes show evidence of strong selection particularly within the last 5,000 years, including ASIP, KITLG, and TYR. However, selection on OCA2/HERC2 seems to be much older and, in contrast to previous claims, we find no evidence of selection on TYRP1.
format Online
Article
Text
id pubmed-6760815
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6760815 2019-10-04 An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data Stern, Aaron J. Wilton, Peter R. Nielsen, Rasmus PLoS Genet Research Article Public Library of Science 2019-09-13 /pmc/articles/PMC6760815/ /pubmed/31518343 http://dx.doi.org/10.1371/journal.pgen.1008384 Text en © 2019 Stern et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6760815/
https://www.ncbi.nlm.nih.gov/pubmed/31518343
http://dx.doi.org/10.1371/journal.pgen.1008384