Archetypal Analysis for population genetics
The estimation of genetic clusters using genomic data has application from genome-wide association studies (GWAS) to demographic history to polygenic risk scores (PRS) and is expected to play an important role in the analyses of increasingly diverse, large-scale cohorts. However, existing methods ar...
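Main Authors: | Gimbernat-Mayol, Julia; Dominguez Mantes, Albert; Bustamante, Carlos D.; Mas Montserrat, Daniel; Ioannidis, Alexander G. |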
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9451066/ https://www.ncbi.nlm.nih.gov/pubmed/36007005 http://dx.doi.org/10.1371/journal.pcbi.1010301 |
_version_ | 1784784657583177728 |
---|---|
author | Gimbernat-Mayol, Julia Dominguez Mantes, Albert Bustamante, Carlos D. Mas Montserrat, Daniel Ioannidis, Alexander G. |
author_sort | Gimbernat-Mayol, Julia |
collection | PubMed |
description | The estimation of genetic clusters using genomic data has application from genome-wide association studies (GWAS) to demographic history to polygenic risk scores (PRS) and is expected to play an important role in the analyses of increasingly diverse, large-scale cohorts. However, existing methods are computationally-intensive, prohibitively so in the case of nationwide biobanks. Here we explore Archetypal Analysis as an efficient, unsupervised approach for identifying genetic clusters and for associating individuals with them. Such unsupervised approaches help avoid conflating socially constructed ethnic labels with genetic clusters by eliminating the need for exogenous training labels. We show that Archetypal Analysis yields similar cluster structure to existing unsupervised methods such as ADMIXTURE and provides interpretative advantages. More importantly, we show that since Archetypal Analysis can be used with lower-dimensional representations of genetic data, significant reductions in computational time and memory requirements are possible. When Archetypal Analysis is run in such a fashion, it takes several orders of magnitude less compute time than the current standard, ADMIXTURE. Finally, we demonstrate uses ranging across datasets from humans to canids. |
format | Online Article Text |
id | pubmed-9451066 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9451066 2022-09-08 Archetypal Analysis for population genetics Gimbernat-Mayol, Julia Dominguez Mantes, Albert Bustamante, Carlos D. Mas Montserrat, Daniel Ioannidis, Alexander G. PLoS Comput Biol Research Article The estimation of genetic clusters using genomic data has application from genome-wide association studies (GWAS) to demographic history to polygenic risk scores (PRS) and is expected to play an important role in the analyses of increasingly diverse, large-scale cohorts. However, existing methods are computationally-intensive, prohibitively so in the case of nationwide biobanks. Here we explore Archetypal Analysis as an efficient, unsupervised approach for identifying genetic clusters and for associating individuals with them. Such unsupervised approaches help avoid conflating socially constructed ethnic labels with genetic clusters by eliminating the need for exogenous training labels. We show that Archetypal Analysis yields similar cluster structure to existing unsupervised methods such as ADMIXTURE and provides interpretative advantages. More importantly, we show that since Archetypal Analysis can be used with lower-dimensional representations of genetic data, significant reductions in computational time and memory requirements are possible. When Archetypal Analysis is run in such a fashion, it takes several orders of magnitude less compute time than the current standard, ADMIXTURE. Finally, we demonstrate uses ranging across datasets from humans to canids. Public Library of Science 2022-08-25 /pmc/articles/PMC9451066/ /pubmed/36007005 http://dx.doi.org/10.1371/journal.pcbi.1010301 Text en © 2022 Gimbernat-Mayol et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Gimbernat-Mayol, Julia Dominguez Mantes, Albert Bustamante, Carlos D. Mas Montserrat, Daniel Ioannidis, Alexander G. Archetypal Analysis for population genetics |
title | Archetypal Analysis for population genetics |
title_sort | archetypal analysis for population genetics |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9451066/ https://www.ncbi.nlm.nih.gov/pubmed/36007005 http://dx.doi.org/10.1371/journal.pcbi.1010301 |