UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts

Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP...

Bibliographic Details
Main authors: Diaz-Papkovich, Alex, Anderson-Trocmé, Luke, Ben-Eghan, Chief, Gravel, Simon
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853336/
https://www.ncbi.nlm.nih.gov/pubmed/31675358
http://dx.doi.org/10.1371/journal.pgen.1008432
_version_ 1783470028287377408
author Diaz-Papkovich, Alex
Anderson-Trocmé, Luke
Ben-Eghan, Chief
Gravel, Simon
author_facet Diaz-Papkovich, Alex
Anderson-Trocmé, Luke
Ben-Eghan, Chief
Gravel, Simon
author_sort Diaz-Papkovich, Alex
collection PubMed
description Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets.
format Online
Article
Text
id pubmed-6853336
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-68533362019-11-22 UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts Diaz-Papkovich, Alex Anderson-Trocmé, Luke Ben-Eghan, Chief Gravel, Simon PLoS Genet Research Article Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets. Public Library of Science 2019-11-01 /pmc/articles/PMC6853336/ /pubmed/31675358 http://dx.doi.org/10.1371/journal.pgen.1008432 Text en © 2019 Diaz-Papkovich et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Diaz-Papkovich, Alex
Anderson-Trocmé, Luke
Ben-Eghan, Chief
Gravel, Simon
UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts
title UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853336/
https://www.ncbi.nlm.nih.gov/pubmed/31675358
http://dx.doi.org/10.1371/journal.pgen.1008432