
Multi-population GWA mapping via multi-task regularized regression

Motivation: Population heterogeneity through admixing of different founder populations can produce spurious associations in genome-wide association studies that are linked to the population structure rather than the phenotype. Since samples from the same population generally co-evolve, different po...


Bibliographic Details
Main Authors: Puniyani, Kriti; Kim, Seyoung; Xing, Eric P.
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881376/
https://www.ncbi.nlm.nih.gov/pubmed/20529908
http://dx.doi.org/10.1093/bioinformatics/btq191
author Puniyani, Kriti
Kim, Seyoung
Xing, Eric P.
collection PubMed
description Motivation: Population heterogeneity through admixing of different founder populations can produce spurious associations in genome-wide association studies that are linked to the population structure rather than the phenotype. Since samples from the same population generally co-evolve, different populations may or may not share the same genetic underpinnings for the seemingly common phenotype. Our goal is to develop a unified framework for detecting causal genetic markers through a joint association analysis of multiple populations. Results: Based on a multi-task regression principle, we present a multi-population group lasso algorithm using L1/L2-regularized regression for joint association analysis of multiple populations that are stratified either via population survey or computational estimation. Our algorithm combines information from genetic markers across populations to identify causal markers. It also implicitly accounts for correlations between the genetic markers, thus enabling better control over false positive rates. Joint analysis across populations enables the detection of weak associations common to all populations with greater power than in a separate analysis of each population. At the same time, the regression-based framework allows causal alleles that are unique to a subset of the populations to be correctly identified. We demonstrate the effectiveness of our method on HapMap-simulated and lactase persistence datasets, where we significantly outperform state-of-the-art methods, with greater power for detecting weak associations and fewer spurious associations. Availability: Software will be available at http://www.sailing.cs.cmu.edu/ Contact: epxing@cs.cmu.edu
format Text
id pubmed-2881376
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2881376 2010-06-08 Multi-population GWA mapping via multi-task regularized regression Puniyani, Kriti Kim, Seyoung Xing, Eric P. Bioinformatics ISMB 2010 Conference Proceedings, July 11 to July 13, 2010, Boston, MA, USA Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881376/ /pubmed/20529908 http://dx.doi.org/10.1093/bioinformatics/btq191 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Multi-population GWA mapping via multi-task regularized regression
topic ISMB 2010 Conference Proceedings, July 11 to July 13, 2010, Boston, MA, USA
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881376/
https://www.ncbi.nlm.nih.gov/pubmed/20529908
http://dx.doi.org/10.1093/bioinformatics/btq191
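
The abstract above describes a multi-population group lasso based on L1/L2-regularized regression. As a rough illustration of that general idea, and not the authors' software, the following minimal sketch treats each population as one regression task and penalizes the Euclidean norm of each marker's coefficients across populations, so that a marker tends to be kept or dropped in all populations together. The function names, the proximal-gradient solver, the penalty weight `lam`, and the simulated genotypes are all assumptions made for this example; the record does not say how the published method is optimized.

```python
import numpy as np


def group_soft_threshold(B, t):
    """Row-wise shrinkage: the proximal operator of t * sum_j ||B[j, :]||_2."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Rows with norm <= t become exactly zero; other rows shrink toward zero.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale


def multi_population_group_lasso(Xs, ys, lam, n_iter=500):
    """Proximal-gradient sketch of an L1/L2-penalized multi-task regression.

    Xs, ys: one genotype matrix (n_k x J) and one phenotype vector (n_k,)
    per population. Returns B (J x K); row j holds the coefficient of
    marker j in each of the K populations.
    """
    J, K = Xs[0].shape[1], len(Xs)
    B = np.zeros((J, K))
    # Step size from the largest per-population Lipschitz constant
    # of the squared-error gradient.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Gradient of the squared-error loss, one column per population.
        grad = np.column_stack([
            X.T @ (X @ B[:, k] - y) / X.shape[0]
            for k, (X, y) in enumerate(zip(Xs, ys))
        ])
        B = group_soft_threshold(B - step * grad, step * lam)
    return B


def demo(seed=0, n=200, J=20, causal=3):
    """Hypothetical toy data: one causal marker shared by two populations."""
    rng = np.random.default_rng(seed)
    Xs, ys = [], []
    for effect in (1.0, 0.8):
        # Simulated genotypes coded 0/1/2, then centered per marker.
        X = rng.binomial(2, 0.3, size=(n, J)).astype(float)
        X -= X.mean(axis=0)
        y = effect * X[:, causal] + 0.1 * rng.standard_normal(n)
        Xs.append(X)
        ys.append(y - y.mean())
    return multi_population_group_lasso(Xs, ys, lam=0.1)
```

Calling `demo()` returns a 20 x 2 coefficient matrix; ranking markers by `np.linalg.norm(B, axis=1)` gives one score per marker that pools evidence across the two simulated populations. Larger values of `lam` zero out more rows, and a marker whose row is nonzero can still have different coefficients in each population.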