A Genealogical Interpretation of Principal Components Analysis

Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes...

Bibliographic Details
Main author: McVean, Gil
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2757795/
https://www.ncbi.nlm.nih.gov/pubmed/19834557
http://dx.doi.org/10.1371/journal.pgen.1000686
author McVean, Gil
collection PubMed
description Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's F_ST and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference.
format Text
id pubmed-2757795
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2757795 2009-10-16 A Genealogical Interpretation of Principal Components Analysis McVean, Gil PLoS Genet Research Article Public Library of Science 2009-10-16 /pmc/articles/PMC2757795/ /pubmed/19834557 http://dx.doi.org/10.1371/journal.pgen.1000686 Text en Gil McVean. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Genealogical Interpretation of Principal Components Analysis
topic Research Article