
How imputation can mitigate SNP ascertainment Bias

BACKGROUND: Population genetic studies based on genotyped single nucleotide polymorphisms (SNPs) are influenced by a non-random selection of the SNPs included in the used genotyping arrays. The resulting bias in the estimation of allele frequency spectra and population genetics parameters like heter...


Bibliographic Details
Main authors: Geibel, Johannes; Reimer, Christian; Pook, Torsten; Weigend, Steffen; Weigend, Annett; Simianer, Henner
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8114708/
https://www.ncbi.nlm.nih.gov/pubmed/33980139
http://dx.doi.org/10.1186/s12864-021-07663-6
_version_ 1783691105399734272
author Geibel, Johannes
Reimer, Christian
Pook, Torsten
Weigend, Steffen
Weigend, Annett
Simianer, Henner
author_sort Geibel, Johannes
collection PubMed
description BACKGROUND: Population genetic studies based on genotyped single nucleotide polymorphisms (SNPs) are influenced by a non-random selection of the SNPs included in the used genotyping arrays. The resulting bias in the estimation of allele frequency spectra and population genetics parameters like heterozygosity and genetic distances relative to whole genome sequencing (WGS) data is known as SNP ascertainment bias. Full correction for this bias requires detailed knowledge of the array design process, which is often not available in practice. This study suggests an alternative approach to mitigate ascertainment bias of a large set of genotyped individuals by using information of a small set of sequenced individuals via imputation without the need for prior knowledge on the array design.
RESULTS: The strategy was first tested by simulating additional ascertainment bias with a set of 1566 chickens from 74 populations that were genotyped for the positions of the Affymetrix Axiom™ 580 k Genome-Wide Chicken Array. Imputation accuracy was shown to be consistently higher for populations used for SNP discovery during the simulated array design process. Reference sets of at least one individual per population in the study set led to a strong correction of ascertainment bias for estimates of expected and observed heterozygosity, Wright’s Fixation Index and Nei’s Standard Genetic Distance. In contrast, unbalanced reference sets (overrepresentation of populations compared to the study set) introduced a new bias towards the reference populations. Finally, the array genotypes were imputed to WGS by utilization of reference sets of 74 individuals (one per population) to 98 individuals (additional commercial chickens) and compared with a mixture of individually and pooled sequenced populations. The imputation reduced the slope between heterozygosity estimates of array data and WGS data from 1.94 to 1.26 when using the smaller balanced reference panel and to 1.44 when using the larger but unbalanced reference panel. This generally supported the results from simulation but was less favorable, advocating for a larger reference panel when imputing to WGS.
CONCLUSIONS: The results highlight the potential of using imputation for mitigation of SNP ascertainment bias but also underline the need for unbiased reference sets.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07663-6.
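The kind of comparison summarized in the description can be illustrated with a small, purely hypothetical sketch. It is not the authors' pipeline: the allele frequencies, the 0.1 minor-allele-frequency cutoff used to mimic array ascertainment, the function names, and the choice to regress array estimates on WGS estimates are all assumptions made for illustration. It computes expected heterozygosity as the mean of 2p(1-p) over biallelic SNPs and the least-squares slope between the two sets of estimates.

```python
import numpy as np


def expected_heterozygosity(freqs):
    """Mean expected heterozygosity 2p(1-p) over biallelic SNPs."""
    p = np.asarray(freqs, dtype=float)
    return float(np.mean(2.0 * p * (1.0 - p)))


def regression_slope(x, y):
    """Slope of the ordinary least-squares line of y on x (with intercept)."""
    slope, _intercept = np.polyfit(np.asarray(x, float), np.asarray(y, float), 1)
    return float(slope)


# Hypothetical allele frequencies: rows are populations, columns are SNPs.
# These numbers are invented for illustration and are not from the study.
wgs_freqs = np.array([
    [0.05, 0.50, 0.40, 0.02, 0.30, 0.01],
    [0.10, 0.45, 0.50, 0.00, 0.20, 0.05],
    [0.00, 0.30, 0.35, 0.05, 0.50, 0.00],
])

# Toy ascertainment (an assumption): the "array" keeps only SNPs whose
# minor allele frequency, averaged over all populations, is at least 0.1.
mean_freq = wgs_freqs.mean(axis=0)
maf = np.minimum(mean_freq, 1.0 - mean_freq)
array_mask = maf >= 0.1

# Per-population heterozygosity from all SNPs ("WGS") and from the
# ascertained subset ("array").
het_wgs = [expected_heterozygosity(row) for row in wgs_freqs]
het_array = [expected_heterozygosity(row[array_mask]) for row in wgs_freqs]

for w, a in zip(het_wgs, het_array):
    print(f"WGS: {w:.3f}  array: {a:.3f}")

# Slope of array estimates regressed on WGS estimates.
print("slope:", regression_slope(het_wgs, het_array))
```

In this toy setting the ascertained subset drops the rare variants, so the array-based heterozygosity is higher than the WGS-based value in every population. The slope is printed only to show the mechanics and says nothing about the magnitudes reported in the record.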
format Online
Article
Text
id pubmed-8114708
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8114708 2021-05-12 How imputation can mitigate SNP ascertainment Bias. Geibel, Johannes; Reimer, Christian; Pook, Torsten; Weigend, Steffen; Weigend, Annett; Simianer, Henner. BMC Genomics. Research.
BioMed Central 2021-05-12 /pmc/articles/PMC8114708/ /pubmed/33980139 http://dx.doi.org/10.1186/s12864-021-07663-6 Text en
© The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title How imputation can mitigate SNP ascertainment Bias
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8114708/
https://www.ncbi.nlm.nih.gov/pubmed/33980139
http://dx.doi.org/10.1186/s12864-021-07663-6