Cargando…

Process-Driven Inference of Biological Network Structure: Feasibility, Minimality, and Multiplicity

A common problem in molecular biology is to use experimental data, such as microarray data, to infer knowledge about the structure of interactions between important molecules in subsystems of the cell. By approximating the state of each molecule as “on” or “off”, it becomes possible to simplify the...

Descripción completa

Detalles Bibliográficos
Autores principales: Wang, Guanyu, Rong, Yongwu, Chen, Hao, Pearson, Carl, Du, Chenghang, Simha, Rahul, Zeng, Chen
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2012
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3399897/
https://www.ncbi.nlm.nih.gov/pubmed/22815739
http://dx.doi.org/10.1371/journal.pone.0040330
Descripción
Sumario:A common problem in molecular biology is to use experimental data, such as microarray data, to infer knowledge about the structure of interactions between important molecules in subsystems of the cell. By approximating the state of each molecule as “on” or “off”, it becomes possible to simplify the problem, and exploit the tools of Boolean analysis for such inference. Amongst Boolean techniques, the process-driven approach has shown promise in being able to identify putative network structures, as well as stability and modularity properties. This paper examines the process-driven approach more formally, and makes four contributions about the computational complexity of the inference problem, under the “dominant inhibition” assumption of molecular interactions. The first is a proof that the feasibility problem (does there exist a network that explains the data?) can be solved in polynomial-time. Second, the minimality problem (what is the smallest network that explains the data?) is shown to be NP-hard, and therefore unlikely to result in a polynomial-time algorithm. Third, a simple polynomial-time heuristic is shown to produce near-minimal solutions, as demonstrated by simulation. Fourth, the theoretical framework explains how multiplicity (the number of network solutions to realize a given biological process), which can take exponential-time to compute, can instead be accurately estimated by a fast, polynomial-time heuristic.