
Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies

False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises t...


Bibliographic Details
Main Authors:	Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S.; Zhang, Zhiwu
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2016
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4734661/ https://www.ncbi.nlm.nih.gov/pubmed/26828793 http://dx.doi.org/10.1371/journal.pgen.1005767

author	Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S.; Zhang, Zhiwu
collection	PubMed
description	False positives in a Genome-Wide Association Study (GWAS) can be effectively controlled by a fixed effect and random effect Mixed Linear Model (MLM) that incorporates population structure and kinship among individuals to adjust association tests on markers; however, the adjustment also compromises true positives. The modified MLM method, Multiple Loci Linear Mixed Model (MLMM), incorporates multiple markers simultaneously as covariates in a stepwise MLM to partially remove the confounding between testing markers and kinship. To completely eliminate the confounding, we divided MLMM into two parts, a Fixed Effect Model (FEM) and a Random Effect Model (REM), and used them iteratively. FEM contains testing markers, one at a time, and multiple associated markers as covariates to control false positives. To avoid model over-fitting in FEM, the associated markers are estimated in REM by using them to define kinship. The P values of testing markers and the associated markers are unified at each iteration. We named the new method Fixed and random model Circulating Probability Unification (FarmCPU). Both real and simulated data analyses demonstrated that FarmCPU improves statistical power compared to current methods. Additional benefits include an efficient computing time that is linear in both the number of individuals and the number of markers. Now, a dataset with half a million individuals and half a million markers can be analyzed within three days.
format	Online Article Text
id	pubmed-4734661
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4734661 2016-02-04 Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies Liu, Xiaolei; Huang, Meng; Fan, Bin; Buckler, Edward S.; Zhang, Zhiwu PLoS Genet Research Article
Public Library of Science 2016-02-01 /pmc/articles/PMC4734661/ /pubmed/26828793 http://dx.doi.org/10.1371/journal.pgen.1005767 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title	Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4734661/ https://www.ncbi.nlm.nih.gov/pubmed/26828793 http://dx.doi.org/10.1371/journal.pgen.1005767
