UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts
Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP...
Main authors: | Diaz-Papkovich, Alex; Anderson-Trocmé, Luke; Ben-Eghan, Chief; Gravel, Simon |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853336/ https://www.ncbi.nlm.nih.gov/pubmed/31675358 http://dx.doi.org/10.1371/journal.pgen.1008432 |
_version_ | 1783470028287377408 |
---|---|
author | Diaz-Papkovich, Alex; Anderson-Trocmé, Luke; Ben-Eghan, Chief; Gravel, Simon |
author_facet | Diaz-Papkovich, Alex; Anderson-Trocmé, Luke; Ben-Eghan, Chief; Gravel, Simon |
author_sort | Diaz-Papkovich, Alex |
collection | PubMed |
description | Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets. |
format | Online Article Text |
id | pubmed-6853336 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-68533362019-11-22 UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts Diaz-Papkovich, Alex Anderson-Trocmé, Luke Ben-Eghan, Chief Gravel, Simon PLoS Genet Research Article Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets. Public Library of Science 2019-11-01 /pmc/articles/PMC6853336/ /pubmed/31675358 http://dx.doi.org/10.1371/journal.pgen.1008432 Text en © 2019 Diaz-Papkovich et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Diaz-Papkovich, Alex Anderson-Trocmé, Luke Ben-Eghan, Chief Gravel, Simon UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
title | UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
title_full | UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
title_fullStr | UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
title_full_unstemmed | UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
title_short | UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
title_sort | umap reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853336/ https://www.ncbi.nlm.nih.gov/pubmed/31675358 http://dx.doi.org/10.1371/journal.pgen.1008432 |
work_keys_str_mv | AT diazpapkovichalex umaprevealscrypticpopulationstructureandphenotypeheterogeneityinlargegenomiccohorts AT andersontrocmeluke umaprevealscrypticpopulationstructureandphenotypeheterogeneityinlargegenomiccohorts AT beneghanchief umaprevealscrypticpopulationstructureandphenotypeheterogeneityinlargegenomiccohorts AT gravelsimon umaprevealscrypticpopulationstructureandphenotypeheterogeneityinlargegenomiccohorts |