Multi-population GWA mapping via multi-task regularized regression
Motivation: Population heterogeneity through admixing of different founder populations can produce spurious associations in genome-wide association studies that are linked to the population structure rather than the phenotype. Since samples from the same population generally co-evolve, different po...
Main authors: | Puniyani, Kriti; Kim, Seyoung; Xing, Eric P. |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881376/ https://www.ncbi.nlm.nih.gov/pubmed/20529908 http://dx.doi.org/10.1093/bioinformatics/btq191 |
_version_ | 1782182109864001536 |
---|---|
author | Puniyani, Kriti Kim, Seyoung Xing, Eric P. |
author_facet | Puniyani, Kriti Kim, Seyoung Xing, Eric P. |
author_sort | Puniyani, Kriti |
collection | PubMed |
description | Motivation: Population heterogeneity through admixing of different founder populations can produce spurious associations in genome-wide association studies that are linked to the population structure rather than the phenotype. Since samples from the same population generally co-evolve, different populations may or may not share the same genetic underpinnings for the seemingly common phenotype. Our goal is to develop a unified framework for detecting causal genetic markers through a joint association analysis of multiple populations. Results: Based on a multi-task regression principle, we present a multi-population group lasso algorithm using L1/L2-regularized regression for joint association analysis of multiple populations that are stratified either via population survey or computational estimation. Our algorithm combines information from genetic markers across populations to identify causal markers. It also implicitly accounts for correlations between the genetic markers, thus enabling better control over false positive rates. Joint analysis across populations enables the detection of weak associations common to all populations with greater power than in a separate analysis of each population. At the same time, the regression-based framework allows causal alleles that are unique to a subset of the populations to be correctly identified. We demonstrate the effectiveness of our method on HapMap-simulated and lactase persistence datasets, where we significantly outperform state-of-the-art methods, with greater power for detecting weak associations and reduced spurious associations. Availability: Software will be available at http://www.sailing.cs.cmu.edu/ Contact: epxing@cs.cmu.edu |
format | Text |
id | pubmed-2881376 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-28813762010-06-08 Multi-population GWA mapping via multi-task regularized regression Puniyani, Kriti Kim, Seyoung Xing, Eric P. Bioinformatics ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA Motivation: Population heterogeneity through admixing of different founder populations can produce spurious associations in genome-wide association studies that are linked to the population structure rather than the phenotype. Since samples from the same population generally co-evolve, different populations may or may not share the same genetic underpinnings for the seemingly common phenotype. Our goal is to develop a unified framework for detecting causal genetic markers through a joint association analysis of multiple populations. Results: Based on a multi-task regression principle, we present a multi-population group lasso algorithm using L1/L2-regularized regression for joint association analysis of multiple populations that are stratified either via population survey or computational estimation. Our algorithm combines information from genetic markers across populations to identify causal markers. It also implicitly accounts for correlations between the genetic markers, thus enabling better control over false positive rates. Joint analysis across populations enables the detection of weak associations common to all populations with greater power than in a separate analysis of each population. At the same time, the regression-based framework allows causal alleles that are unique to a subset of the populations to be correctly identified. We demonstrate the effectiveness of our method on HapMap-simulated and lactase persistence datasets, where we significantly outperform state-of-the-art methods, with greater power for detecting weak associations and reduced spurious associations. Availability: Software will be available at http://www.sailing.cs.cmu.edu/ Contact: epxing@cs.cmu.edu Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881376/ /pubmed/20529908 http://dx.doi.org/10.1093/bioinformatics/btq191 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA Puniyani, Kriti Kim, Seyoung Xing, Eric P. Multi-population GWA mapping via multi-task regularized regression |
title | Multi-population GWA mapping via multi-task regularized regression |
title_full | Multi-population GWA mapping via multi-task regularized regression |
title_fullStr | Multi-population GWA mapping via multi-task regularized regression |
title_full_unstemmed | Multi-population GWA mapping via multi-task regularized regression |
title_short | Multi-population GWA mapping via multi-task regularized regression |
title_sort | multi-population gwa mapping via multi-task regularized regression |
topic | ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881376/ https://www.ncbi.nlm.nih.gov/pubmed/20529908 http://dx.doi.org/10.1093/bioinformatics/btq191 |
work_keys_str_mv | AT puniyanikriti multipopulationgwamappingviamultitaskregularizedregression AT kimseyoung multipopulationgwamappingviamultitaskregularizedregression AT xingericp multipopulationgwamappingviamultitaskregularizedregression |
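
The description above names an L1/L2-regularized (group lasso) multi-task regression, in which each marker's coefficients across populations form one group. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a squared-error loss, one regression task per population, and a plain proximal gradient solver, none of which is stated in this record. The function names, the penalty weight `lam`, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np


def group_soft_threshold(B, t):
    """Shrink each row of B toward zero by t in L2 norm (row = one marker)."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Rows whose norm is below t are set exactly to zero.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale


def multi_population_group_lasso(Xs, ys, lam, n_iter=500):
    """Illustrative solver (assumed, not from the paper) for

        sum_k 0.5 * ||y_k - X_k b_k||^2  +  lam * sum_j ||B[j, :]||_2

    Xs: list of (n_k, p) genotype matrices, one per population.
    ys: list of (n_k,) phenotype vectors, one per population.
    Returns B with shape (p, K); column k holds population k's coefficients.
    """
    K = len(Xs)
    p = Xs[0].shape[1]
    B = np.zeros((p, K))
    # The smooth part separates by population, so its gradient is Lipschitz
    # with constant equal to the largest squared spectral norm among the X_k.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Gradient of the squared-error loss, one column per population.
        G = np.column_stack(
            [X.T @ (X @ B[:, k] - y) for k, (X, y) in enumerate(zip(Xs, ys))]
        )
        # Gradient step, then the row-wise proximal step for the L1/L2 penalty.
        B = group_soft_threshold(B - step * G, step * lam)
    return B
```

Because the penalty acts on whole rows, a marker is either dropped for every population or kept with a separate coefficient per population, which is one way to read the abstract's claim that shared weak associations and population-specific effects can both be represented.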