Quantitative evaluation of nonlinear methods for population structure visualization and inference

Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the contex...

Bibliographic Details
Main Authors: Ubbens, Jordan; Feldmann, Mitchell J; Stavness, Ian; Sharpe, Andrew G
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9434256/
https://www.ncbi.nlm.nih.gov/pubmed/35900169
http://dx.doi.org/10.1093/g3journal/jkac191
_version_ 1784780825648168960
author Ubbens, Jordan
Feldmann, Mitchell J
Stavness, Ian
Sharpe, Andrew G
author_facet Ubbens, Jordan
Feldmann, Mitchell J
Stavness, Ian
Sharpe, Andrew G
author_sort Ubbens, Jordan
collection PubMed
description Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the context of medical genetics, it is an important confounding variable in genome-wide association studies. Recently, many nonlinear dimensionality reduction techniques have been proposed for the population structure visualization task. However, an objective comparison of these techniques has so far been missing from the literature. In this article, we discuss the previously proposed nonlinear techniques and some of their potential weaknesses. We then propose a novel quantitative evaluation methodology for comparing these nonlinear techniques, based on populations for which pedigree is known a priori either through artificial selection or simulation. Based on this evaluation metric, we find graph-based algorithms such as t-SNE and UMAP to be superior to principal component analysis, while neural network-based methods fall behind.
format Online
Article
Text
id pubmed-9434256
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9434256 2022-09-01 Quantitative evaluation of nonlinear methods for population structure visualization and inference Ubbens, Jordan Feldmann, Mitchell J Stavness, Ian Sharpe, Andrew G G3 (Bethesda) Investigation Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the context of medical genetics, it is an important confounding variable in genome-wide association studies. Recently, many nonlinear dimensionality reduction techniques have been proposed for the population structure visualization task. However, an objective comparison of these techniques has so far been missing from the literature. In this article, we discuss the previously proposed nonlinear techniques and some of their potential weaknesses. We then propose a novel quantitative evaluation methodology for comparing these nonlinear techniques, based on populations for which pedigree is known a priori either through artificial selection or simulation. Based on this evaluation metric, we find graph-based algorithms such as t-SNE and UMAP to be superior to principal component analysis, while neural network-based methods fall behind. Oxford University Press 2022-07-28 /pmc/articles/PMC9434256/ /pubmed/35900169 http://dx.doi.org/10.1093/g3journal/jkac191 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Investigation
Ubbens, Jordan
Feldmann, Mitchell J
Stavness, Ian
Sharpe, Andrew G
Quantitative evaluation of nonlinear methods for population structure visualization and inference
title Quantitative evaluation of nonlinear methods for population structure visualization and inference
topic Investigation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9434256/
https://www.ncbi.nlm.nih.gov/pubmed/35900169
http://dx.doi.org/10.1093/g3journal/jkac191