On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression

Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome-wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most wid...

Bibliographic Details
Main authors: Boerner, Vinzent; Wittenburg, Dörte
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5986875/
https://www.ncbi.nlm.nih.gov/pubmed/29896217
http://dx.doi.org/10.3389/fgene.2018.00185
author Boerner, Vinzent
Wittenburg, Dörte
author_facet Boerner, Vinzent
Wittenburg, Dörte
author_sort Boerner, Vinzent
collection PubMed
description Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome-wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most widely used approach capable of providing biologically interpretable results is a likelihood formulation which allows for estimation of founder genome proportions and founder allele frequencies conditional on the observed genotypes. However, if founder allele frequencies are known and samples are dominated by admixed genotypes, this approach may lead to biased inference. In addition, processing time increases drastically with the number of genetic markers. This article describes a simplified approach for obtaining biologically meaningful measures of population stratification at the genotype level conditional on known founder allele frequencies. It was tested on cattle and human data sets with 4,022 and 150,000 genetic markers, respectively, and proved to be very accurate in situations where founder populations were correctly specified, or under-, over-, and misspecified. Moreover, processing time was only marginally affected by an increase in the number of markers.
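The abstract does not spell out the regression itself, so the following is only a minimal sketch of one plausible reading of "constrained genomic regression": an individual's genotype (scaled to allele frequency) is regressed on known founder allele frequencies, with the coefficients constrained to be non-negative and to sum to one so that they can be read as genome proportions. The constraint set, the projected-gradient solver, and the names `project_to_simplex` and `estimate_proportions` are assumptions made for illustration and are not taken from the article.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_k >= 0, sum(w) = 1}."""
    sorted_desc = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u in enumerate(sorted_desc, start=1):
        cumulative += u
        candidate = (cumulative - 1.0) / i
        if u - candidate > 0:
            theta = candidate  # keep the threshold of the last valid index
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(genotype, founder_freqs, steps=2000):
    """Hypothetical estimator of founder genome proportions.

    genotype:      allele dosages per marker (0..2, floats allowed)
    founder_freqs: one row per marker, one allele frequency per founder
    Minimises 0.5 * ||F w - y||^2 over the probability simplex by
    projected gradient descent (an assumed solver, not the authors').
    """
    n_founders = len(founder_freqs[0])
    y = [g / 2.0 for g in genotype]  # dosage -> frequency scale
    # trace(F^T F) bounds the largest eigenvalue, giving a safe step size
    lipschitz = sum(f * f for row in founder_freqs for f in row)
    step = 1.0 / lipschitz
    w = [1.0 / n_founders] * n_founders  # start from equal proportions
    for _ in range(steps):
        grad = [0.0] * n_founders
        for row, yi in zip(founder_freqs, y):
            residual = sum(f * wk for f, wk in zip(row, w)) - yi
            for k, f in enumerate(row):
                grad[k] += residual * f
        w = project_to_simplex([wk - step * gk for wk, gk in zip(w, grad)])
    return w


# Toy data (invented for illustration): two founders, six markers.
founder_freqs = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.3],
                 [0.2, 0.7], [0.5, 0.5], [0.95, 0.05]]
true_w = [0.25, 0.75]
# Expected dosage of an individual with exactly these proportions.
genotype = [2.0 * sum(f * w for f, w in zip(row, true_w))
            for row in founder_freqs]
estimate = estimate_proportions(genotype, founder_freqs)
print(estimate)
```

In this sketch the cost of each iteration grows linearly with the number of markers. The toy genotype is noise-free, so it only checks that the solver recovers proportions it was built from; it says nothing about accuracy on real cattle or human data.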
format Online
Article
Text
id pubmed-5986875
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5986875 2018-06-12 On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression Boerner, Vinzent Wittenburg, Dörte Front Genet Genetics Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome-wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most widely used approach capable of providing biologically interpretable results is a likelihood formulation which allows for estimation of founder genome proportions and founder allele frequencies conditional on the observed genotypes. However, if founder allele frequencies are known and samples are dominated by admixed genotypes, this approach may lead to biased inference. In addition, processing time increases drastically with the number of genetic markers. This article describes a simplified approach for obtaining biologically meaningful measures of population stratification at the genotype level conditional on known founder allele frequencies. It was tested on cattle and human data sets with 4,022 and 150,000 genetic markers, respectively, and proved to be very accurate in situations where founder populations were correctly specified, or under-, over-, and misspecified. Moreover, processing time was only marginally affected by an increase in the number of markers. Frontiers Media S.A. 2018-05-29 /pmc/articles/PMC5986875/ /pubmed/29896217 http://dx.doi.org/10.3389/fgene.2018.00185 Text en Copyright © 2018 Boerner and Wittenburg. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5986875/
https://www.ncbi.nlm.nih.gov/pubmed/29896217
http://dx.doi.org/10.3389/fgene.2018.00185