
Visualizing population structure with variational autoencoders

Dimensionality reduction is a common tool for visualization and inference of population structure from genotypes, but popular methods either return too many dimensions for easy plotting (PCA) or fail to preserve global geometry (t-SNE and UMAP). Here we explore the utility of variational autoencoder...


Bibliographic Details
Main authors:	Battey, C J; Coffing, Gabrielle C; Kern, Andrew D
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Software and Data Resources
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022710/ https://www.ncbi.nlm.nih.gov/pubmed/33561250 http://dx.doi.org/10.1093/g3journal/jkaa036

author	Battey, C J; Coffing, Gabrielle C; Kern, Andrew D
collection	PubMed
description	Dimensionality reduction is a common tool for visualization and inference of population structure from genotypes, but popular methods either return too many dimensions for easy plotting (PCA) or fail to preserve global geometry (t-SNE and UMAP). Here we explore the utility of variational autoencoders (VAEs)—generative machine learning models in which a pair of neural networks seek to first compress and then recreate the input data—for visualizing population genetic variation. VAEs incorporate nonlinear relationships, allow users to define the dimensionality of the latent space, and in our tests preserve global geometry better than t-SNE and UMAP. Our implementation, which we call popvae, is available as a command-line python program at github.com/kr-colab/popvae. The approach yields latent embeddings that capture subtle aspects of population structure in humans and Anopheles mosquitoes, and can generate artificial genotypes characteristic of a given sample or population.
format	Online Article Text
id	pubmed-8022710
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8022710 2021-04-09 Visualizing population structure with variational autoencoders Battey, C J; Coffing, Gabrielle C; Kern, Andrew D G3 (Bethesda) Software and Data Resources Dimensionality reduction is a common tool for visualization and inference of population structure from genotypes, but popular methods either return too many dimensions for easy plotting (PCA) or fail to preserve global geometry (t-SNE and UMAP). Here we explore the utility of variational autoencoders (VAEs)—generative machine learning models in which a pair of neural networks seek to first compress and then recreate the input data—for visualizing population genetic variation. VAEs incorporate nonlinear relationships, allow users to define the dimensionality of the latent space, and in our tests preserve global geometry better than t-SNE and UMAP. Our implementation, which we call popvae, is available as a command-line python program at github.com/kr-colab/popvae. The approach yields latent embeddings that capture subtle aspects of population structure in humans and Anopheles mosquitoes, and can generate artificial genotypes characteristic of a given sample or population. Oxford University Press 2021-01-18 /pmc/articles/PMC8022710/ /pubmed/33561250 http://dx.doi.org/10.1093/g3journal/jkaa036 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Visualizing population structure with variational autoencoders
topic	Software and Data Resources
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022710/ https://www.ncbi.nlm.nih.gov/pubmed/33561250 http://dx.doi.org/10.1093/g3journal/jkaa036


Similar items