
Tailored graphical lasso for data integration in gene network reconstruction

BACKGROUND: Identifying gene interactions is a topic of great importance in genomics, and approaches based on network models provide a powerful tool for studying these. Assuming a Gaussian graphical model, a gene association network may be estimated from multiomic data based on the non-zero entries...


Bibliographic Details
Main authors: Lingjærde, Camilla; Lien, Tonje G.; Borgan, Ørnulf; Bergholtz, Helga; Glad, Ingrid K.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8518261/
https://www.ncbi.nlm.nih.gov/pubmed/34654363
http://dx.doi.org/10.1186/s12859-021-04413-z
_version_ 1784584185500925952
author Lingjærde, Camilla
Lien, Tonje G.
Borgan, Ørnulf
Bergholtz, Helga
Glad, Ingrid K.
author_sort Lingjærde, Camilla
collection PubMed
description BACKGROUND: Identifying gene interactions is a topic of great importance in genomics, and approaches based on network models provide a powerful tool for studying these. Assuming a Gaussian graphical model, a gene association network may be estimated from multiomic data based on the non-zero entries of the inverse covariance matrix. Inferring such biological networks is challenging because of the high dimensionality of the problem, which makes traditional estimators unsuitable. The graphical lasso is constructed for the estimation of sparse inverse covariance matrices in such situations, using L1-penalization on the matrix entries. The weighted graphical lasso is an extension in which prior biological information from other sources is integrated into the model. There are, however, issues with this approach, as it naïvely forces the prior information into the network estimation, even if it is misleading or does not agree with the data at hand. Further, if an associated network based on other data is used as the prior, the method often fails to utilize the information effectively. RESULTS: We propose a novel graphical lasso approach, the tailored graphical lasso, that aims to handle prior information of unknown accuracy more effectively. We provide an R package implementing the method, tailoredGlasso. Applying the method to both simulated and real multiomic data sets, we find that it outperforms the unweighted and weighted graphical lasso in terms of all performance measures we consider. In fact, the graphical lasso and weighted graphical lasso can be considered special cases of the tailored graphical lasso, and a parameter determined by the data measures the usefulness of the prior information. We also find that among a larger set of methods, the tailored graphical lasso is the most suitable for network inference from high-dimensional data with prior information of unknown accuracy. With our method, mRNA data are demonstrated to provide highly useful prior information for protein–protein interaction networks. CONCLUSIONS: The method we introduce utilizes useful prior information more effectively without involving any risk of loss of accuracy should the prior information be misleading.
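As a reading aid for the abstract's statement that the network is estimated from the non-zero entries of the inverse covariance matrix, the following is a minimal sketch of how an edge list could be read off an already estimated sparse precision matrix. It is not the authors' tailoredGlasso implementation; the function name edges_from_precision, the tolerance tol, the gene labels and the matrix values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def edges_from_precision(theta, names, tol=1e-8):
    """List an edge for every non-zero off-diagonal precision entry.

    theta is a symmetric precision (inverse covariance) matrix given as a
    list of lists; names labels its rows and columns. Entries with absolute
    value at most tol are treated as zero (an assumed threshold).
    """
    edges = []
    for i, j in combinations(range(len(names)), 2):
        if abs(theta[i][j]) > tol:
            # Partial correlation implied by the precision matrix entries.
            rho = -theta[i][j] / sqrt(theta[i][i] * theta[j][j])
            edges.append((names[i], names[j], rho))
    return edges


# Hypothetical 3-gene precision matrix (illustrative values, not from the article).
theta = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
]
genes = ["g1", "g2", "g3"]
edges = edges_from_precision(theta, genes)
print(edges)  # [('g1', 'g2', 0.5)]
```

In this toy matrix only the g1-g2 entry is non-zero, so the inferred network has a single edge; g3 is conditionally independent of the other two genes under the Gaussian graphical model assumption.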
format Online
Article
Text
id pubmed-8518261
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8518261 2021-10-20 BMC Bioinformatics Methodology Article BioMed Central 2021-10-15 /pmc/articles/PMC8518261/ /pubmed/34654363 http://dx.doi.org/10.1186/s12859-021-04413-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Tailored graphical lasso for data integration in gene network reconstruction
topic Methodology Article