Cargando…

Connectivity problems on heterogeneous graphs

BACKGROUND: Network connectivity problems are abundant in computational biology research, where graphs are used to represent a range of phenomena: from physical interactions between molecules to more abstract relationships such as gene co-expression. One common challenge in studying biological netwo...

Descripción completa

Detalles Bibliográficos
Autores principales: Wu, Jimmy, Khodaverdian, Alex, Weitz, Benjamin, Yosef, Nir
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2019
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6408827/
https://www.ncbi.nlm.nih.gov/pubmed/30899321
http://dx.doi.org/10.1186/s13015-019-0141-z
_version_ 1783401860623761408
author Wu, Jimmy
Khodaverdian, Alex
Weitz, Benjamin
Yosef, Nir
author_facet Wu, Jimmy
Khodaverdian, Alex
Weitz, Benjamin
Yosef, Nir
author_sort Wu, Jimmy
collection PubMed
description BACKGROUND: Network connectivity problems are abundant in computational biology research, where graphs are used to represent a range of phenomena: from physical interactions between molecules to more abstract relationships such as gene co-expression. One common challenge in studying biological networks is the need to extract meaningful, small subgraphs out of large databases of potential interactions. A useful abstraction for this task turned out to be the Steiner Network problems: given a reference “database” graph, find a parsimonious subgraph that satisfies a given set of connectivity demands. While this formulation proved useful in a number of instances, the next challenge is to account for the fact that the reference graph may not be static. This can happen for instance, when studying protein measurements in single cells or at different time points, whereby different subsets of conditions can have different protein milieu. RESULTS AND DISCUSSION: We introduce the condition Steiner Network problem in which we concomitantly consider a set of distinct biological conditions. Each condition is associated with a set of connectivity demands, as well as a set of edges that are assumed to be present in that condition. The goal of this problem is to find a minimal subgraph that satisfies all the demands through paths that are present in the respective condition. We show that introducing multiple conditions as an additional factor makes this problem much harder to approximate. Specifically, we prove that for C conditions, this new problem is NP-hard to approximate to a factor of [Formula: see text] , for every [Formula: see text] and [Formula: see text] , and that this bound is tight. Moving beyond the worst case, we explore a special set of instances where the reference graph grows monotonically between conditions, and show that this problem admits substantially improved approximation algorithms. We also developed an integer linear programming solver for the general problem and demonstrate its ability to reach optimality with instances from the human protein interaction network. CONCLUSION: Our results demonstrate that in contrast to most connectivity problems studied in computational biology, accounting for multiplicity of biological conditions adds considerable complexity, which we propose to address with a new solver. Importantly, our results extend to several network connectivity problems that are commonly used in computational biology, such as Prize-Collecting Steiner Tree, and provide insight into the theoretical guarantees for their applications in a multiple condition setting.
format Online
Article
Text
id pubmed-6408827
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-64088272019-03-21 Connectivity problems on heterogeneous graphs Wu, Jimmy Khodaverdian, Alex Weitz, Benjamin Yosef, Nir Algorithms Mol Biol Research BACKGROUND: Network connectivity problems are abundant in computational biology research, where graphs are used to represent a range of phenomena: from physical interactions between molecules to more abstract relationships such as gene co-expression. One common challenge in studying biological networks is the need to extract meaningful, small subgraphs out of large databases of potential interactions. A useful abstraction for this task turned out to be the Steiner Network problems: given a reference “database” graph, find a parsimonious subgraph that satisfies a given set of connectivity demands. While this formulation proved useful in a number of instances, the next challenge is to account for the fact that the reference graph may not be static. This can happen for instance, when studying protein measurements in single cells or at different time points, whereby different subsets of conditions can have different protein milieu. RESULTS AND DISCUSSION: We introduce the condition Steiner Network problem in which we concomitantly consider a set of distinct biological conditions. Each condition is associated with a set of connectivity demands, as well as a set of edges that are assumed to be present in that condition. The goal of this problem is to find a minimal subgraph that satisfies all the demands through paths that are present in the respective condition. We show that introducing multiple conditions as an additional factor makes this problem much harder to approximate. Specifically, we prove that for C conditions, this new problem is NP-hard to approximate to a factor of [Formula: see text] , for every [Formula: see text] and [Formula: see text] , and that this bound is tight. Moving beyond the worst case, we explore a special set of instances where the reference graph grows monotonically between conditions, and show that this problem admits substantially improved approximation algorithms. We also developed an integer linear programming solver for the general problem and demonstrate its ability to reach optimality with instances from the human protein interaction network. CONCLUSION: Our results demonstrate that in contrast to most connectivity problems studied in computational biology, accounting for multiplicity of biological conditions adds considerable complexity, which we propose to address with a new solver. Importantly, our results extend to several network connectivity problems that are commonly used in computational biology, such as Prize-Collecting Steiner Tree, and provide insight into the theoretical guarantees for their applications in a multiple condition setting. BioMed Central 2019-03-08 /pmc/articles/PMC6408827/ /pubmed/30899321 http://dx.doi.org/10.1186/s13015-019-0141-z Text en © The Author(s) 2019 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Wu, Jimmy
Khodaverdian, Alex
Weitz, Benjamin
Yosef, Nir
Connectivity problems on heterogeneous graphs
title Connectivity problems on heterogeneous graphs
title_full Connectivity problems on heterogeneous graphs
title_fullStr Connectivity problems on heterogeneous graphs
title_full_unstemmed Connectivity problems on heterogeneous graphs
title_short Connectivity problems on heterogeneous graphs
title_sort connectivity problems on heterogeneous graphs
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6408827/
https://www.ncbi.nlm.nih.gov/pubmed/30899321
http://dx.doi.org/10.1186/s13015-019-0141-z
work_keys_str_mv AT wujimmy connectivityproblemsonheterogeneousgraphs
AT khodaverdianalex connectivityproblemsonheterogeneousgraphs
AT weitzbenjamin connectivityproblemsonheterogeneousgraphs
AT yosefnir connectivityproblemsonheterogeneousgraphs