
netReg: network-regularized linear models for biological association studies

SUMMARY: Modelling biological associations or dependencies using linear regression is often complicated when the analyzed data sets are high-dimensional and fewer observations than variables are available (n ≪ p). For genomic data sets, penalized regression methods have been applied to settle this issu...


Bibliographic Details
Main Authors: Dirmeier, Simon; Fuchs, Christiane; Mueller, Nikola S; Theis, Fabian J
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030897/
https://www.ncbi.nlm.nih.gov/pubmed/29077797
http://dx.doi.org/10.1093/bioinformatics/btx677
_version_ 1783337217039532032
author Dirmeier, Simon
Fuchs, Christiane
Mueller, Nikola S
Theis, Fabian J
author_sort Dirmeier, Simon
collection PubMed
description SUMMARY: Modelling biological associations or dependencies using linear regression is often complicated when the analyzed data sets are high-dimensional and fewer observations than variables are available (n ≪ p). For genomic data sets, penalized regression methods have been applied to settle this issue. Recently proposed regression models utilize prior knowledge on dependencies, e.g. in the form of graphs, arguing that this information will lead to more reliable estimates for regression coefficients. However, none of the proposed models for multivariate genomic response variables has been implemented as a computationally efficient, freely available library. In this paper we propose netReg, a package for graph-penalized regression models that use large networks and thousands of variables. netReg incorporates a priori generated biological graph information into linear models, yielding sparse or smooth solutions for regression coefficients. AVAILABILITY AND IMPLEMENTATION: netReg is implemented as both an R package and a C++ command line tool. The main computations are done in C++, where we use Armadillo for fast matrix calculations and Dlib for optimization. The R package is freely available on Bioconductor at https://bioconductor.org/packages/netReg. The command line tool can be installed using the conda channel Bioconda. Installation details, issue reports, development versions, documentation and tutorials for the R and C++ versions, and the R package vignette can be found on GitHub at https://dirmeier.github.io/netReg/. The GitHub page also contains code for benchmarking and example datasets used in this paper.
format Online
Article
Text
id pubmed-6030897
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6030897 2018-07-10 netReg: network-regularized linear models for biological association studies Dirmeier, Simon Fuchs, Christiane Mueller, Nikola S Theis, Fabian J Bioinformatics Applications Notes SUMMARY: Modelling biological associations or dependencies using linear regression is often complicated when the analyzed data sets are high-dimensional and fewer observations than variables are available (n ≪ p). For genomic data sets, penalized regression methods have been applied to settle this issue. Recently proposed regression models utilize prior knowledge on dependencies, e.g. in the form of graphs, arguing that this information will lead to more reliable estimates for regression coefficients. However, none of the proposed models for multivariate genomic response variables has been implemented as a computationally efficient, freely available library. In this paper we propose netReg, a package for graph-penalized regression models that use large networks and thousands of variables. netReg incorporates a priori generated biological graph information into linear models, yielding sparse or smooth solutions for regression coefficients. AVAILABILITY AND IMPLEMENTATION: netReg is implemented as both an R package and a C++ command line tool. The main computations are done in C++, where we use Armadillo for fast matrix calculations and Dlib for optimization. The R package is freely available on Bioconductor at https://bioconductor.org/packages/netReg. The command line tool can be installed using the conda channel Bioconda. Installation details, issue reports, development versions, documentation and tutorials for the R and C++ versions, and the R package vignette can be found on GitHub at https://dirmeier.github.io/netReg/. The GitHub page also contains code for benchmarking and example datasets used in this paper.
Oxford University Press 2018-03-01 2017-10-25 /pmc/articles/PMC6030897/ /pubmed/29077797 http://dx.doi.org/10.1093/bioinformatics/btx677 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Applications Notes
Dirmeier, Simon
Fuchs, Christiane
Mueller, Nikola S
Theis, Fabian J
netReg: network-regularized linear models for biological association studies
title netReg: network-regularized linear models for biological association studies
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030897/
https://www.ncbi.nlm.nih.gov/pubmed/29077797
http://dx.doi.org/10.1093/bioinformatics/btx677
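
The description in this record mentions graph-penalized linear models that yield smooth coefficient estimates. As a rough illustration of that idea only, the following Python sketch fits a multivariate linear model with a graph Laplacian penalty on the predictors, using the closed-form solution of the penalized least-squares problem. It is not netReg's API or implementation: the function names (`graph_laplacian`, `fit_graph_penalized`, `smoothness`), the unnormalized Laplacian, and the parameter name `psi` are assumptions made for this example, and the sparse (L1) part of the model is left out.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of a symmetric adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def fit_graph_penalized(X, Y, adjacency, psi):
    """Minimize ||Y - X B||_F^2 + psi * tr(B^T L B) over B.

    Setting the gradient to zero gives the linear system
    (X^T X + psi * L) B = X^T Y, which is solved here directly.
    Hypothetical helper, not part of any published package.
    """
    L = graph_laplacian(adjacency)
    gram = X.T @ X + psi * L
    # lstsq also copes with a singular system, which can occur when n < p.
    return np.linalg.lstsq(gram, X.T @ Y, rcond=None)[0]


def smoothness(B, adjacency):
    """Graph penalty tr(B^T L B); small when linked predictors have similar coefficients."""
    L = graph_laplacian(adjacency)
    return float(np.trace(B.T @ L @ B))
```

With `psi = 0` the fit reduces to ordinary least squares. Larger values of `psi` pull the coefficients of predictors that are connected in the graph toward each other, which is one way to read the "smooth solutions" mentioned in the description.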