Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data

Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate mode...

Bibliographic Details
Main authors: Gutenkunst, Ryan N.; Hernandez, Ryan D.; Williamson, Scott H.; Bustamante, Carlos D.
Format: Text
Language: English
Published: Public Library of Science 2009
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2760211/
https://www.ncbi.nlm.nih.gov/pubmed/19851460
http://dx.doi.org/10.1371/journal.pgen.1000695
author Gutenkunst, Ryan N.
Hernandez, Ryan D.
Williamson, Scott H.
Bustamante, Carlos D.
collection PubMed
description Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate models we numerically compute the expected spectrum using a diffusion approximation to the one-locus, two-allele Wright-Fisher process, involving up to three simultaneous populations. Our approach is a composite likelihood scheme, since linkage between neutral loci alters the variance but not the expectation of the frequency spectrum. We thus use bootstraps incorporating linkage to estimate uncertainties for parameters and significance values for hypothesis tests. Our method can also incorporate selection on single sites, predicting the joint distribution of selected alleles among populations experiencing a bevy of evolutionary forces, including expansions, contractions, migrations, and admixture. We model human expansion out of Africa and the settlement of the New World, using 5 Mb of noncoding DNA resequenced in 68 individuals from 4 populations (YRI, CHB, CEU, and MXL) by the Environmental Genome Project. We infer divergence between West African and Eurasian populations 140 thousand years ago (95% confidence interval: 40–270 kya). This is earlier than other genetic studies, in part because we incorporate migration. We estimate the European (CEU) and East Asian (CHB) divergence time to be 23 kya (95% c.i.: 17–43 kya), long after archeological evidence places modern humans in Europe. Finally, we estimate divergence between East Asians (CHB) and Mexican-Americans (MXL) of 22 kya (95% c.i.: 16.3–26.9 kya), and our analysis yields no evidence for subsequent migration. 
Furthermore, combining our demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).
format Text
id pubmed-2760211
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2760211 2009-10-23 Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data Gutenkunst, Ryan N. Hernandez, Ryan D. Williamson, Scott H. Bustamante, Carlos D. PLoS Genet Research Article Public Library of Science 2009-10-23 /pmc/articles/PMC2760211/ /pubmed/19851460 http://dx.doi.org/10.1371/journal.pgen.1000695 Text en This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. https://creativecommons.org/publicdomain/zero/1.0/
title Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data
topic Research Article