Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection

Discovering important genes that account for the phenotype of interest has long been a challenge in genome-wide expression analysis. Analyses such as gene set enrichment analysis (GSEA) that incorporate pathway information have become widespread in hypothesis testing, but pathway-based approaches ha...

Bibliographic Details
Main authors: Zeng, Yaohui, Breheny, Patrick
Format: Online Article Text
Language: English
Published: Libertas Academica 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5026200/
https://www.ncbi.nlm.nih.gov/pubmed/27679461
http://dx.doi.org/10.4137/CIN.S40043
_version_ 1782454092863373312
author Zeng, Yaohui
Breheny, Patrick
collection PubMed
description Discovering important genes that account for the phenotype of interest has long been a challenge in genome-wide expression analysis. Analyses such as gene set enrichment analysis (GSEA) that incorporate pathway information have become widespread in hypothesis testing, but pathway-based approaches have been largely absent from regression methods due to the challenges of dealing with overlapping pathways and the resulting lack of available software. The R package grpreg is widely used to fit group lasso and other group-penalized regression models; in this study, we develop an extension, grpregOverlap, to allow for overlapping group structure using a latent variable approach. We compare this approach to the ordinary lasso and to GSEA using both simulated and real data. We find that incorporation of prior pathway information can substantially improve the accuracy of gene expression classifiers, and we shed light on several ways in which hypothesis-testing approaches such as GSEA differ from regression approaches with respect to the analysis of pathway data.
format Online
Article
Text
id pubmed-5026200
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-5026200 2016-09-27 Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection Zeng, Yaohui Breheny, Patrick Cancer Inform Methodology Libertas Academica 2016-09-15 /pmc/articles/PMC5026200/ /pubmed/27679461 http://dx.doi.org/10.4137/CIN.S40043 Text en © 2016 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
title Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5026200/
https://www.ncbi.nlm.nih.gov/pubmed/27679461
http://dx.doi.org/10.4137/CIN.S40043