How do SNP ascertainment schemes and population demographics affect inferences about population history?
BACKGROUND: The selection of variable sites for inclusion in genomic analyses can influence results, especially when exemplar populations are used to determine polymorphic sites. We tested the impact of ascertainment bias on the inference of population genetic parameters using empirical and simulate...
Main authors: | McTavish, Emily Jane; Hillis, David M |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428227/ https://www.ncbi.nlm.nih.gov/pubmed/25887858 http://dx.doi.org/10.1186/s12864-015-1469-5 |
_version_ | 1782370858029809664 |
---|---|
author | McTavish, Emily Jane Hillis, David M |
author_facet | McTavish, Emily Jane Hillis, David M |
author_sort | McTavish, Emily Jane |
collection | PubMed |
description | BACKGROUND: The selection of variable sites for inclusion in genomic analyses can influence results, especially when exemplar populations are used to determine polymorphic sites. We tested the impact of ascertainment bias on the inference of population genetic parameters using empirical and simulated data representing the three major continental groups of cattle: European, African, and Indian. We simulated data under three demographic models. Each simulated data set was subjected to three ascertainment schemes: (I) random selection; (II) geographically biased selection; and (III) selection biased toward loci polymorphic in multiple groups. Empirical data comprised samples of 25 individuals representing each continental group. These cattle were genotyped for 47,506 loci from the bovine 50 K SNP panel. We compared the inference of population histories for the empirical and simulated data sets across different ascertainment conditions using F(ST) and principal components analysis (PCA). RESULTS: Bias toward shared polymorphism across continental groups is apparent in the empirical SNP data. Bias toward uneven levels of within-group polymorphism decreases estimates of F(ST) between groups. Subpopulation-biased selection of SNPs changes the weighting of principal component axes and can affect inferences about proportions of admixture and population histories using PCA. PCA-based inferences of population relationships are largely congruent across types of ascertainment bias, even when ascertainment bias is strong. CONCLUSIONS: Analyses of ascertainment bias in genomic data have largely been conducted on human data. As genomic analyses are being applied to non-model organisms, and across taxa with deeper divergences, care must be taken to consider the potential for bias in ascertainment of variation to affect inferences. Estimates of F(ST), time of separation, and population divergence as estimated by principal components analysis can be misleading if this bias is not taken into account. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1469-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4428227 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-44282272015-05-13 How do SNP ascertainment schemes and population demographics affect inferences about population history? McTavish, Emily Jane Hillis, David M BMC Genomics Research Article BACKGROUND: The selection of variable sites for inclusion in genomic analyses can influence results, especially when exemplar populations are used to determine polymorphic sites. We tested the impact of ascertainment bias on the inference of population genetic parameters using empirical and simulated data representing the three major continental groups of cattle: European, African, and Indian. We simulated data under three demographic models. Each simulated data set was subjected to three ascertainment schemes: (I) random selection; (II) geographically biased selection; and (III) selection biased toward loci polymorphic in multiple groups. Empirical data comprised samples of 25 individuals representing each continental group. These cattle were genotyped for 47,506 loci from the bovine 50 K SNP panel. We compared the inference of population histories for the empirical and simulated data sets across different ascertainment conditions using F(ST) and principal components analysis (PCA). RESULTS: Bias toward shared polymorphism across continental groups is apparent in the empirical SNP data. Bias toward uneven levels of within-group polymorphism decreases estimates of F(ST) between groups. Subpopulation-biased selection of SNPs changes the weighting of principal component axes and can affect inferences about proportions of admixture and population histories using PCA. PCA-based inferences of population relationships are largely congruent across types of ascertainment bias, even when ascertainment bias is strong. CONCLUSIONS: Analyses of ascertainment bias in genomic data have largely been conducted on human data. As genomic analyses are being applied to non-model organisms, and across taxa with deeper divergences, care must be taken to consider the potential for bias in ascertainment of variation to affect inferences. Estimates of F(ST), time of separation, and population divergence as estimated by principal components analysis can be misleading if this bias is not taken into account. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1469-5) contains supplementary material, which is available to authorized users. BioMed Central 2015-04-03 /pmc/articles/PMC4428227/ /pubmed/25887858 http://dx.doi.org/10.1186/s12864-015-1469-5 Text en © McTavish and Hillis; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article McTavish, Emily Jane Hillis, David M How do SNP ascertainment schemes and population demographics affect inferences about population history? |
title | How do SNP ascertainment schemes and population demographics affect inferences about population history? |
title_full | How do SNP ascertainment schemes and population demographics affect inferences about population history? |
title_fullStr | How do SNP ascertainment schemes and population demographics affect inferences about population history? |
title_full_unstemmed | How do SNP ascertainment schemes and population demographics affect inferences about population history? |
title_short | How do SNP ascertainment schemes and population demographics affect inferences about population history? |
title_sort | how do snp ascertainment schemes and population demographics affect inferences about population history? |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428227/ https://www.ncbi.nlm.nih.gov/pubmed/25887858 http://dx.doi.org/10.1186/s12864-015-1469-5 |
work_keys_str_mv | AT mctavishemilyjane howdosnpascertainmentschemesandpopulationdemographicsaffectinferencesaboutpopulationhistory AT hillisdavidm howdosnpascertainmentschemesandpopulationdemographicsaffectinferencesaboutpopulationhistory |
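
The description field above names three ascertainment schemes and reports that the choice of loci can shift F(ST). The following is a minimal toy sketch of that idea, not the authors' pipeline: the allele frequencies are made up, the estimator is a simple heterozygosity-based ratio of sums (the record does not say which estimator the paper used), and the function names `fst`, `ascertain`, and `polymorphic` are hypothetical. Treating "geographically biased selection" as "polymorphic in one focal group" is also an assumption made only for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(freqs_a, freqs_b):
    """Two-group F_ST as sum(H_T - H_S) / sum(H_T) over loci.

    Assumes equal group weights; a simplification chosen for this sketch.
    """
    num = 0.0
    den = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        h_s = (heterozygosity(pa) + heterozygosity(pb)) / 2.0  # within groups
        h_t = heterozygosity((pa + pb) / 2.0)                  # pooled
        num += h_t - h_s
        den += h_t
    return num / den if den > 0 else float("nan")


def polymorphic(p):
    """True when both alleles are present in the group."""
    return 0.0 < p < 1.0


def ascertain(freqs_a, freqs_b, keep):
    """Keep only loci for which keep(pa, pb) is true."""
    kept = [(pa, pb) for pa, pb in zip(freqs_a, freqs_b) if keep(pa, pb)]
    return [pa for pa, _ in kept], [pb for _, pb in kept]


# Made-up allele frequencies for five loci in two groups (illustration only).
group_a = [1.0, 0.2, 0.5, 0.9, 0.0]
group_b = [0.0, 0.8, 0.5, 1.0, 0.3]

# Scheme I stand-in: no filtering (all loci).
fst_all = fst(group_a, group_b)

# Scheme II stand-in (assumption): loci polymorphic in focal group A.
fst_focal = fst(*ascertain(group_a, group_b, lambda pa, pb: polymorphic(pa)))

# Scheme III stand-in: loci polymorphic in both groups.
fst_shared = fst(*ascertain(group_a, group_b,
                            lambda pa, pb: polymorphic(pa) and polymorphic(pb)))

print(f"all loci:          {fst_all:.3f}")
print(f"focal-group loci:  {fst_focal:.3f}")
print(f"shared-polymorphic: {fst_shared:.3f}")
```

In this toy data the filters discard the fixed difference at the first locus, so both filtered estimates come out lower than the unfiltered one. Real SNP panels and real demographic histories can behave differently, which is the question the article itself examines.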