PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data
Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely appl...
Main Authors: | Hoffman, Gabriel E., Logsdon, Benjamin A., Mezey, Jason G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3694815/ https://www.ncbi.nlm.nih.gov/pubmed/23825936 http://dx.doi.org/10.1371/journal.pcbi.1003101 |
_version_ | 1782274892263063552 |
---|---|
author | Hoffman, Gabriel E.; Logsdon, Benjamin A.; Mezey, Jason G. |
author_facet | Hoffman, Gabriel E.; Logsdon, Benjamin A.; Mezey, Jason G. |
author_sort | Hoffman, Gabriel E. |
collection | PubMed |
description | Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework. |
format | Online Article Text |
id | pubmed-3694815 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3694815 2013-07-03 PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data Hoffman, Gabriel E. Logsdon, Benjamin A. Mezey, Jason G. PLoS Comput Biol Research Article Public Library of Science 2013-06-27 /pmc/articles/PMC3694815/ /pubmed/23825936 http://dx.doi.org/10.1371/journal.pcbi.1003101 Text en © 2013 Hoffman et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Hoffman, Gabriel E. Logsdon, Benjamin A. Mezey, Jason G. PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data |
title | PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data |
title_full | PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data |
title_fullStr | PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data |
title_full_unstemmed | PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data |
title_short | PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data |
title_sort | puma: a unified framework for penalized multiple regression analysis of gwas data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3694815/ https://www.ncbi.nlm.nih.gov/pubmed/23825936 http://dx.doi.org/10.1371/journal.pcbi.1003101 |
work_keys_str_mv | AT hoffmangabriele pumaaunifiedframeworkforpenalizedmultipleregressionanalysisofgwasdata AT logsdonbenjamina pumaaunifiedframeworkforpenalizedmultipleregressionanalysisofgwasdata AT mezeyjasong pumaaunifiedframeworkforpenalizedmultipleregressionanalysisofgwasdata |