Correcting for Population Structure and Kinship Using the Linear Mixed Model: Theory and Extensions

Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM...

Bibliographic Details
Main author: Hoffman, Gabriel E.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3810480/
https://www.ncbi.nlm.nih.gov/pubmed/24204578
http://dx.doi.org/10.1371/journal.pone.0075707
author Hoffman, Gabriel E.
collection PubMed
description Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.
format Online; Article; Text
id pubmed-3810480
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3810480 2013-11-07 PLoS One Research Article
Public Library of Science 2013-10-28 Text en © 2013 Gabriel E http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Correcting for Population Structure and Kinship Using the Linear Mixed Model: Theory and Extensions
topic Research Article