
Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes


Bibliographic Details
Main authors: Flegontov, Pavel; Işıldak, Ulaş; Maier, Robert; Yüncü, Eren; Changmai, Piya; Reich, David
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10508636/
https://www.ncbi.nlm.nih.gov/pubmed/37676865
http://dx.doi.org/10.1371/journal.pgen.1010931
author Flegontov, Pavel
Işıldak, Ulaş
Maier, Robert
Yüncü, Eren
Changmai, Piya
Reich, David
collection PubMed
description f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data—that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed—but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True “outgroup ascertainment” is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the “Affymetrix Human Origins array” which has been genotyped on thousands of modern individuals from hundreds of populations, or the “1240k” in-solution enrichment reagent which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases. 
However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
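As an illustration of the quantities the abstract refers to, the following is a minimal sketch, not the authors' code. It computes the standard f4-statistic, the average over SNPs of (pA - pB)(pC - pD), once on all SNPs and once on SNPs ascertained as common in a pooled panel of groups. The function names, the population labels, the allele frequencies, the unweighted pooling, and the 0.05 frequency threshold are all assumptions chosen for the example. The toy data only show the mechanics of restricting the SNP set; they do not reproduce the biases studied in the paper.

```python
from typing import Dict, List, Sequence

# Population name -> per-SNP allele frequency (hypothetical layout).
Freqs = Dict[str, List[float]]


def f4(freqs: Freqs, a: str, b: str, c: str, d: str, keep: Sequence[int]) -> float:
    """Average of (pA - pB) * (pC - pD) over the SNP indices in `keep`."""
    if not keep:
        raise ValueError("no SNPs left after ascertainment")
    total = 0.0
    for i in keep:
        total += (freqs[a][i] - freqs[b][i]) * (freqs[c][i] - freqs[d][i])
    return total / len(keep)


def ascertain_common(freqs: Freqs, panel: Sequence[str], min_maf: float) -> List[int]:
    """Indices of SNPs whose pooled minor allele frequency is >= min_maf.

    Pooling is an unweighted mean of the panel frequencies (an assumption
    made for this sketch, not a scheme taken from the paper).
    """
    n_snps = len(next(iter(freqs.values())))
    keep = []
    for i in range(n_snps):
        pooled = sum(freqs[p][i] for p in panel) / len(panel)
        if min(pooled, 1.0 - pooled) >= min_maf:
            keep.append(i)
    return keep


# Made-up frequencies for four hypothetical populations and five SNPs.
toy = {
    "A": [0.10, 0.50, 0.90, 0.00, 0.30],
    "B": [0.20, 0.40, 0.80, 0.00, 0.30],
    "C": [0.60, 0.50, 0.10, 0.02, 0.70],
    "D": [0.40, 0.50, 0.30, 0.00, 0.50],
}
all_snps = list(range(5))
common = ascertain_common(toy, panel=["C", "D"], min_maf=0.05)
f4_all = f4(toy, "A", "B", "C", "D", all_snps)
f4_asc = f4(toy, "A", "B", "C", "D", common)
```

Here the fourth SNP is rare in the pooled panel and is dropped, so the two estimates are averaged over different SNP sets and differ. Whether such a difference amounts to a bias relative to a model's expectation depends on the demographic history, which is the question the study addresses by simulation.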
format Online
Article
Text
id pubmed-10508636
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10508636 2023-09-20 PLoS Genet Research Article Public Library of Science 2023-09-07 /pmc/articles/PMC10508636/ /pubmed/37676865 http://dx.doi.org/10.1371/journal.pgen.1010931 Text en © 2023 Flegontov et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10508636/
https://www.ncbi.nlm.nih.gov/pubmed/37676865
http://dx.doi.org/10.1371/journal.pgen.1010931