Gene set selection via LASSO penalized regression (SLPR)
Gene set testing is an important bioinformatics technique that addresses the challenges of power, interpretation and replication. To better support the analysis of large and highly overlapping gene set collections, researchers have recently developed a number of multiset methods that jointly evaluat...
Main authors: | Frost, H. Robert; Amos, Christopher I. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5499546/ https://www.ncbi.nlm.nih.gov/pubmed/28472344 http://dx.doi.org/10.1093/nar/gkx291 |
_version_ | 1783248490754736128 |
---|---|
author | Frost, H. Robert; Amos, Christopher I. |
author_facet | Frost, H. Robert; Amos, Christopher I. |
author_sort | Frost, H. Robert |
collection | PubMed |
description | Gene set testing is an important bioinformatics technique that addresses the challenges of power, interpretation and replication. To better support the analysis of large and highly overlapping gene set collections, researchers have recently developed a number of multiset methods that jointly evaluate all gene sets in a collection to identify a parsimonious group of functionally independent sets. Unfortunately, current multiset methods all use binary indicators for gene and gene set activity and assume that a gene is active if any containing gene set is active. This simplistic model limits performance on many types of genomic data. To address this limitation, we developed gene set Selection via LASSO Penalized Regression (SLPR), a novel mapping of multiset gene set testing to penalized multiple linear regression. The SLPR method assumes a linear relationship between continuous measures of gene activity and the activity of all gene sets in the collection. As we demonstrate via simulation studies and the analysis of TCGA data using MSigDB gene sets, the SLPR method outperforms existing multiset methods when the true biological process is well approximated by continuous activity measures and a linear association between genes and gene sets. |
format | Online Article Text |
id | pubmed-5499546 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-54995462017-07-10 Gene set selection via LASSO penalized regression (SLPR) Frost, H. Robert Amos, Christopher I. Nucleic Acids Res Methods Online Gene set testing is an important bioinformatics technique that addresses the challenges of power, interpretation and replication. To better support the analysis of large and highly overlapping gene set collections, researchers have recently developed a number of multiset methods that jointly evaluate all gene sets in a collection to identify a parsimonious group of functionally independent sets. Unfortunately, current multiset methods all use binary indicators for gene and gene set activity and assume that a gene is active if any containing gene set is active. This simplistic model limits performance on many types of genomic data. To address this limitation, we developed gene set Selection via LASSO Penalized Regression (SLPR), a novel mapping of multiset gene set testing to penalized multiple linear regression. The SLPR method assumes a linear relationship between continuous measures of gene activity and the activity of all gene sets in the collection. As we demonstrate via simulation studies and the analysis of TCGA data using MSigDB gene sets, the SLPR method outperforms existing multiset methods when the true biological process is well approximated by continuous activity measures and a linear association between genes and gene sets. Oxford University Press 2017-07-07 2017-05-02 /pmc/articles/PMC5499546/ /pubmed/28472344 http://dx.doi.org/10.1093/nar/gkx291 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Online Frost, H. Robert Amos, Christopher I. Gene set selection via LASSO penalized regression (SLPR) |
title | Gene set selection via LASSO penalized regression (SLPR) |
title_sort | gene set selection via lasso penalized regression (slpr) |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5499546/ https://www.ncbi.nlm.nih.gov/pubmed/28472344 http://dx.doi.org/10.1093/nar/gkx291 |
work_keys_str_mv | AT frosthrobert genesetselectionvialassopenalizedregressionslpr AT amoschristopheri genesetselectionvialassopenalizedregressionslpr |
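
The description field above frames multiset gene set testing as a penalized multiple linear regression: a continuous per-gene activity measure is regressed on the membership of each gene in every set of the collection, and the sets with nonzero LASSO coefficients are the ones selected. The following is a minimal sketch of that idea, not the authors' implementation. The gene set names, set sizes, noise level, penalty value `lam`, and the plain coordinate-descent solver are all assumptions chosen for illustration; the record does not specify how the published method computes gene activity or tunes the penalty.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def centre(values):
    """Subtract the mean so that no explicit intercept is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_cd(columns, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    `columns` is a list of centred predictor columns, each of length n.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - X b, with b starting at zero
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            sq = sum(x * x for x in col) / n
            if sq == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / sq
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - x * delta for x, r in zip(col, resid)]
                beta[j] = new
    return beta


# Hypothetical, overlapping gene sets over 60 genes (illustrative only).
gene_sets = {
    "SET_A": set(range(0, 20)),
    "SET_B": set(range(10, 30)),   # overlaps SET_A
    "SET_C": set(range(30, 50)),
    "SET_D": set(range(15, 25)),   # overlaps SET_A and SET_B
}
n_genes = 60

# Simulated continuous gene activity: only SET_A is truly active.
rng = random.Random(0)
activity = [
    (2.0 if g in gene_sets["SET_A"] else 0.0) + rng.gauss(0.0, 0.1)
    for g in range(n_genes)
]

# One membership indicator column per gene set (rows are genes).
names = sorted(gene_sets)
columns = [
    centre([1.0 if g in gene_sets[s] else 0.0 for g in range(n_genes)])
    for s in names
]

beta = lasso_cd(columns, centre(activity), lam=0.1)
coef = dict(zip(names, beta))
selected = [s for s in names if coef[s] != 0.0]

print(coef)
print(selected)
```

In this toy setup the overlapping sets share genes with the active set, so a one-set-at-a-time test could flag them as well; fitting all sets jointly with an L1 penalty is what allows the redundant sets to be shrunk to zero. In practice the penalty would normally be chosen by cross-validation rather than fixed by hand.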