Multi-study inference of regulatory networks for more accurate models of gene regulation
Gene regulatory networks are composed of sub-networks that are often shared across biological processes, cell-types, and organisms. Leveraging multiple sources of information, such as publicly available gene expression datasets, could therefore be helpful when learning a network of interest. Integra...
Main authors: | Castro, Dayanne M.; de Veaux, Nicholas R.; Miraldi, Emily R.; Bonneau, Richard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6363223/ https://www.ncbi.nlm.nih.gov/pubmed/30677040 http://dx.doi.org/10.1371/journal.pcbi.1006591 |
_version_ | 1783393067241308160 |
---|---|
author | Castro, Dayanne M.; de Veaux, Nicholas R.; Miraldi, Emily R.; Bonneau, Richard |
author_facet | Castro, Dayanne M.; de Veaux, Nicholas R.; Miraldi, Emily R.; Bonneau, Richard |
author_sort | Castro, Dayanne M. |
collection | PubMed |
description | Gene regulatory networks are composed of sub-networks that are often shared across biological processes, cell-types, and organisms. Leveraging multiple sources of information, such as publicly available gene expression datasets, could therefore be helpful when learning a network of interest. Integrating data across different studies, however, raises numerous technical concerns. Hence, a common approach in network inference, and broadly in genomics research, is to separately learn models from each dataset and combine the results. Individual models, however, often suffer from under-sampling, poor generalization and limited network recovery. In this study, we explore previous integration strategies, such as batch-correction and model ensembles, and introduce a new multitask learning approach for joint network inference across several datasets. Our method initially estimates the activities of transcription factors, and subsequently, infers the relevant network topology. As regulatory interactions are context-dependent, we estimate model coefficients as a combination of both dataset-specific and conserved components. In addition, adaptive penalties may be used to favor models that include interactions derived from multiple sources of prior knowledge including orthogonal genomics experiments. We evaluate generalization and network recovery using examples from Bacillus subtilis and Saccharomyces cerevisiae, and show that sharing information across models improves network reconstruction. Finally, we demonstrate robustness to both false positives in the prior information and heterogeneity among datasets. |
format | Online Article Text |
id | pubmed-6363223 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6363223 2019-02-15 Multi-study inference of regulatory networks for more accurate models of gene regulation Castro, Dayanne M. de Veaux, Nicholas R. Miraldi, Emily R. Bonneau, Richard PLoS Comput Biol Research Article Gene regulatory networks are composed of sub-networks that are often shared across biological processes, cell-types, and organisms. Leveraging multiple sources of information, such as publicly available gene expression datasets, could therefore be helpful when learning a network of interest. Integrating data across different studies, however, raises numerous technical concerns. Hence, a common approach in network inference, and broadly in genomics research, is to separately learn models from each dataset and combine the results. Individual models, however, often suffer from under-sampling, poor generalization and limited network recovery. In this study, we explore previous integration strategies, such as batch-correction and model ensembles, and introduce a new multitask learning approach for joint network inference across several datasets. Our method initially estimates the activities of transcription factors, and subsequently, infers the relevant network topology. As regulatory interactions are context-dependent, we estimate model coefficients as a combination of both dataset-specific and conserved components. In addition, adaptive penalties may be used to favor models that include interactions derived from multiple sources of prior knowledge including orthogonal genomics experiments. We evaluate generalization and network recovery using examples from Bacillus subtilis and Saccharomyces cerevisiae, and show that sharing information across models improves network reconstruction. Finally, we demonstrate robustness to both false positives in the prior information and heterogeneity among datasets. 
Public Library of Science 2019-01-24 /pmc/articles/PMC6363223/ /pubmed/30677040 http://dx.doi.org/10.1371/journal.pcbi.1006591 Text en © 2019 Castro et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Castro, Dayanne M. de Veaux, Nicholas R. Miraldi, Emily R. Bonneau, Richard Multi-study inference of regulatory networks for more accurate models of gene regulation |
title | Multi-study inference of regulatory networks for more accurate models of gene regulation |
title_full | Multi-study inference of regulatory networks for more accurate models of gene regulation |
title_fullStr | Multi-study inference of regulatory networks for more accurate models of gene regulation |
title_full_unstemmed | Multi-study inference of regulatory networks for more accurate models of gene regulation |
title_short | Multi-study inference of regulatory networks for more accurate models of gene regulation |
title_sort | multi-study inference of regulatory networks for more accurate models of gene regulation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6363223/ https://www.ncbi.nlm.nih.gov/pubmed/30677040 http://dx.doi.org/10.1371/journal.pcbi.1006591 |
work_keys_str_mv | AT castrodayannem multistudyinferenceofregulatorynetworksformoreaccuratemodelsofgeneregulation AT deveauxnicholasr multistudyinferenceofregulatorynetworksformoreaccuratemodelsofgeneregulation AT miraldiemilyr multistudyinferenceofregulatorynetworksformoreaccuratemodelsofgeneregulation AT bonneaurichard multistudyinferenceofregulatorynetworksformoreaccuratemodelsofgeneregulation |
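
The description field above says that model coefficients are estimated "as a combination of both dataset-specific and conserved components". The following is a minimal sketch of that general idea on synthetic data, not the authors' implementation. It assumes a linear model per dataset whose weights are the sum of a conserved matrix `B` (rows shrunk jointly across datasets) and a dataset-specific matrix `S` (shrunk elementwise), fitted by proximal gradient descent. The penalty choices, the function names, and the toy data are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise shrinkage: proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def group_soft_threshold(B, t):
    """Row-wise shrinkage: proximal operator of t * sum_j ||B[j, :]||_2."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return B * scale

def fit_shared_plus_specific(Xs, ys, lam_shared=0.05, lam_specific=0.05, n_iter=500):
    """Fit y_k ~ X_k @ (B[:, k] + S[:, k]) for every dataset k (hypothetical helper).

    B is penalized row-wise so a regulator tends to be kept or dropped for all
    datasets together; S is penalized elementwise to allow dataset-specific edges.
    """
    n_features, n_tasks = Xs[0].shape[1], len(Xs)
    B = np.zeros((n_features, n_tasks))
    S = np.zeros((n_features, n_tasks))
    # The smooth loss depends on B + S, so its joint Lipschitz constant is 2 * L.
    L = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / (2.0 * L)
    for _ in range(n_iter):
        G = np.zeros_like(B)
        for k, (X, y) in enumerate(zip(Xs, ys)):
            resid = X @ (B[:, k] + S[:, k]) - y
            G[:, k] = X.T @ resid / X.shape[0]  # same gradient for B and S
        B = group_soft_threshold(B - step * G, step * lam_shared)
        S = soft_threshold(S - step * G, step * lam_specific)
    return B, S

# Toy data (assumed): 5 candidate regulators, 2 datasets.
# Regulator 0 acts in both datasets; regulator 3 acts only in dataset 0.
rng = np.random.default_rng(0)
true_W = np.array([[2.0, 2.0],
                   [0.0, 0.0],
                   [0.0, 0.0],
                   [1.5, 0.0],
                   [0.0, 0.0]])
Xs = [rng.standard_normal((100, 5)) for _ in range(2)]
ys = [X @ true_W[:, k] + 0.1 * rng.standard_normal(100) for k, X in enumerate(Xs)]

B, S = fit_shared_plus_specific(Xs, ys)
W = B + S  # combined coefficients, one column per dataset
print(np.round(W, 2))
```

In this sketch the combined matrix `W = B + S` is what would be read as the per-dataset network weights; how the weight of a single entry is split between `B` and `S` depends on the two penalty strengths, which are free parameters here.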