Quantitative evaluation of nonlinear methods for population structure visualization and inference
Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the contex...
Main authors: | Ubbens, Jordan; Feldmann, Mitchell J; Stavness, Ian; Sharpe, Andrew G |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Investigation |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9434256/ https://www.ncbi.nlm.nih.gov/pubmed/35900169 http://dx.doi.org/10.1093/g3journal/jkac191 |
_version_ | 1784780825648168960 |
---|---|
author | Ubbens, Jordan; Feldmann, Mitchell J; Stavness, Ian; Sharpe, Andrew G |
author_sort | Ubbens, Jordan |
collection | PubMed |
description | Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the context of medical genetics, it is an important confounding variable in genome-wide association studies. Recently, many nonlinear dimensionality reduction techniques have been proposed for the population structure visualization task. However, an objective comparison of these techniques has so far been missing from the literature. In this article, we discuss the previously proposed nonlinear techniques and some of their potential weaknesses. We then propose a novel quantitative evaluation methodology for comparing these nonlinear techniques, based on populations for which pedigree is known a priori either through artificial selection or simulation. Based on this evaluation metric, we find graph-based algorithms such as t-SNE and UMAP to be superior to principal component analysis, while neural network-based methods fall behind. |
format | Online Article Text |
id | pubmed-9434256 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9434256 2022-09-01 Quantitative evaluation of nonlinear methods for population structure visualization and inference Ubbens, Jordan; Feldmann, Mitchell J; Stavness, Ian; Sharpe, Andrew G G3 (Bethesda) Investigation Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the context of medical genetics, it is an important confounding variable in genome-wide association studies. Recently, many nonlinear dimensionality reduction techniques have been proposed for the population structure visualization task. However, an objective comparison of these techniques has so far been missing from the literature. In this article, we discuss the previously proposed nonlinear techniques and some of their potential weaknesses. We then propose a novel quantitative evaluation methodology for comparing these nonlinear techniques, based on populations for which pedigree is known a priori either through artificial selection or simulation. Based on this evaluation metric, we find graph-based algorithms such as t-SNE and UMAP to be superior to principal component analysis, while neural network-based methods fall behind. Oxford University Press 2022-07-28 /pmc/articles/PMC9434256/ /pubmed/35900169 http://dx.doi.org/10.1093/g3journal/jkac191 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Quantitative evaluation of nonlinear methods for population structure visualization and inference |
topic | Investigation |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9434256/ https://www.ncbi.nlm.nih.gov/pubmed/35900169 http://dx.doi.org/10.1093/g3journal/jkac191 |