On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression
Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome-wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most wid...
Main Authors: | Boerner, Vinzent, Wittenburg, Dörte |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5986875/ https://www.ncbi.nlm.nih.gov/pubmed/29896217 http://dx.doi.org/10.3389/fgene.2018.00185 |
_version_ | 1783329005069402112 |
---|---|
author | Boerner, Vinzent Wittenburg, Dörte |
author_facet | Boerner, Vinzent Wittenburg, Dörte |
author_sort | Boerner, Vinzent |
collection | PubMed |
description | Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome-wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most widely used approach capable of providing biologically interpretable results is a likelihood formulation which allows for estimation of founder genome proportions and founder allele frequency conditional on the observed genotypes. However, if founder allele frequencies are known and samples are dominated by admixed genotypes, this approach may lead to biased inference. In addition, processing time increases drastically with the number of genetic markers. This article describes a simplified approach for obtaining biologically meaningful measures of population stratification at the genotype level conditional on known founder allele frequencies. It was tested on cattle and human data sets with 4,022 and 150,000 genetic markers, respectively, and proved to be very accurate in situations where founder populations were correctly specified, or under-, over-, and misspecified. Moreover, processing time was only marginally affected by an increase in the number of markers. |
format | Online Article Text |
id | pubmed-5986875 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-59868752018-06-12 On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression Boerner, Vinzent Wittenburg, Dörte Front Genet Genetics Quantifying the population stratification in genotype samples has become a standard procedure for data manipulation before conducting genome-wide association studies, as well as for tracing patterns of migration in humans and animals, and for inference about extinct founder populations. The most widely used approach capable of providing biologically interpretable results is a likelihood formulation which allows for estimation of founder genome proportions and founder allele frequency conditional on the observed genotypes. However, if founder allele frequencies are known and samples are dominated by admixed genotypes, this approach may lead to biased inference. In addition, processing time increases drastically with the number of genetic markers. This article describes a simplified approach for obtaining biologically meaningful measures of population stratification at the genotype level conditional on known founder allele frequencies. It was tested on cattle and human data sets with 4,022 and 150,000 genetic markers, respectively, and proved to be very accurate in situations where founder populations were correctly specified, or under-, over-, and misspecified. Moreover, processing time was only marginally affected by an increase in the number of markers. Frontiers Media S.A. 2018-05-29 /pmc/articles/PMC5986875/ /pubmed/29896217 http://dx.doi.org/10.3389/fgene.2018.00185 Text en Copyright © 2018 Boerner and Wittenburg. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Boerner, Vinzent Wittenburg, Dörte On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression |
title | On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression |
title_full | On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression |
title_fullStr | On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression |
title_full_unstemmed | On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression |
title_short | On Estimation of Genome Composition in Genetically Admixed Individuals Using Constrained Genomic Regression |
title_sort | on estimation of genome composition in genetically admixed individuals using constrained genomic regression |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5986875/ https://www.ncbi.nlm.nih.gov/pubmed/29896217 http://dx.doi.org/10.3389/fgene.2018.00185 |
work_keys_str_mv | AT boernervinzent onestimationofgenomecompositioningeneticallyadmixedindividualsusingconstrainedgenomicregression AT wittenburgdorte onestimationofgenomecompositioningeneticallyadmixedindividualsusingconstrainedgenomicregression |
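
The description field says only that genome composition is estimated conditional on known founder allele frequencies; this record does not give the estimator itself. As a rough illustration of what a constrained regression of this kind could look like, the sketch below regresses one individual's allele dosage (divided by two) on the founder allele frequencies, with the coefficients restricted to be non-negative and to sum to one. Everything in it is an assumption for illustration: the least-squares objective, the projected gradient solver, the function names `project_to_simplex` and `estimate_proportions`, and the simulated data are not taken from the article.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {b : b_i >= 0, sum(b) = 1}."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        running_sum += u_j
        t = (running_sum - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(freqs, dosage, n_iter=1000):
    """Least-squares fit of dosage/2 on founder frequencies, on the simplex.

    freqs  : list of rows, one per marker, each holding one allele frequency
             per founder population (assumed known)
    dosage : allele counts (0, 1 or 2) of one individual, one per marker
    """
    n_pops = len(freqs[0])
    y = [d / 2.0 for d in dosage]
    # 1 / trace(F'F) never exceeds 1 / (largest eigenvalue of F'F),
    # so this fixed step size is safe for projected gradient descent.
    step = 1.0 / sum(f * f for row in freqs for f in row)
    b = [1.0 / n_pops] * n_pops  # start from equal proportions
    for _ in range(n_iter):
        grad = [0.0] * n_pops
        for row, y_i in zip(freqs, y):
            resid = sum(f * b_k for f, b_k in zip(row, b)) - y_i
            for k, f in enumerate(row):
                grad[k] += resid * f
        b = project_to_simplex([b_k - step * g for b_k, g in zip(b, grad)])
    return b


# Hypothetical data: 200 markers, three founder populations.
random.seed(1)
true_b = [0.6, 0.3, 0.1]
freqs = [[random.random() for _ in true_b] for _ in range(200)]
dosage = []
for row in freqs:
    p = sum(f * w for f, w in zip(row, true_b))  # mixture allele frequency
    dosage.append(sum(random.random() < p for _ in range(2)))  # two alleles
print([round(x, 3) for x in estimate_proportions(freqs, dosage)])
```

The two constraints are what make the coefficients readable as genome proportions. With only 200 simulated markers the printed estimate is noisy, and it should tighten as the number of markers grows.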