
3off2: A network reconstruction algorithm based on 2-point and 3-point information statistics

BACKGROUND: The reconstruction of reliable graphical models from observational data is important in bioinformatics and other computational fields applying network reconstruction methods to large, yet finite datasets. The main network reconstruction approaches are either based on Bayesian scores, whi...


Bibliographic Details
Main authors: Affeldt, Séverine, Verny, Louis, Isambert, Hervé
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959376/
https://www.ncbi.nlm.nih.gov/pubmed/26823190
http://dx.doi.org/10.1186/s12859-015-0856-x
author Affeldt, Séverine
Verny, Louis
Isambert, Hervé
collection PubMed
description BACKGROUND: The reconstruction of reliable graphical models from observational data is important in bioinformatics and other computational fields applying network reconstruction methods to large, yet finite datasets. The main network reconstruction approaches are either based on Bayesian scores, which enable the ranking of alternative Bayesian networks, or rely on the identification of structural independencies, which correspond to missing edges in the underlying network. Bayesian inference methods typically require heuristic search strategies, such as hill-climbing algorithms, to sample the super-exponential space of possible networks. By contrast, constraint-based methods, such as the PC and IC algorithms, are expected to run in polynomial time on sparse underlying graphs, provided that a correct list of conditional independencies is available. Yet, in practice, conditional independencies need to be ascertained from the available observational data, based on adjustable statistical significance levels, and are not robust to sampling noise from finite datasets. RESULTS: We propose a more robust approach to reconstruct graphical models from finite datasets. It combines constraint-based and Bayesian approaches to infer structural independencies based on the ranking of their most likely contributing nodes. In a nutshell, this local optimization scheme and corresponding 3off2 algorithm iteratively “take off” the most likely conditional 3-point information from the 2-point (mutual) information between each pair of nodes. Conditional independencies are thus derived by progressively collecting the most significant indirect contributions to all pairwise mutual information. The resulting network skeleton is then partially directed by orienting and propagating edge directions, based on the sign and magnitude of the conditional 3-point information of unshielded triples. 
The approach is shown to outperform both constraint-based and Bayesian inference methods on a range of benchmark networks. The 3off2 approach is then applied to the reconstruction of the hematopoiesis regulation network based on recent single cell expression data and is found to retrieve more experimentally ascertained regulations between transcription factors than with other available methods. CONCLUSIONS: The novel information-theoretic approach and corresponding 3off2 algorithm combine constraint-based and Bayesian inference methods to reliably reconstruct graphical models, despite inherent sampling noise in finite datasets. In particular, experimentally verified interactions as well as novel predicted regulations are established on the hematopoiesis regulatory networks based on single cell expression data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0856-x) contains supplementary material, which is available to authorized users.
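The abstract above refers to 2-point (mutual) information and 3-point information without giving formulas. As a minimal sketch, assuming the common convention I(X;Y;Z) = I(X;Y) - I(X;Y|Z) and plain plug-in entropy estimates on discrete samples, the two quantities can be computed as follows. The function names are illustrative only, and this is not the 3off2 implementation: the record does not describe the finite-sample scoring the algorithm uses to rank "most likely" contributors.

```python
from collections import Counter
from math import log


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable observations."""
    n = len(samples)
    return -sum((c / n) * log(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    """2-point information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


def three_point_information(x, y, z):
    """3-point information under the assumed convention I(X;Y) - I(X;Y|Z)."""
    return mutual_information(x, y) - conditional_mutual_information(x, y, z)


# Toy case 1: Z fully explains the dependence between X and Y (X = Y = Z).
z_common = [0, 1, 0, 1]
x_common, y_common = list(z_common), list(z_common)

# Toy case 2: X and Y are independent and Z = X xor Y (a collider X -> Z <- Y).
x_col = [0, 0, 1, 1]
y_col = [0, 1, 0, 1]
z_col = [a ^ b for a, b in zip(x_col, y_col)]

print(three_point_information(x_common, y_common, z_common))
print(three_point_information(x_col, y_col, z_col))
```

In the first toy case the 3-point information is positive, so subtracting it from I(X;Y) leaves no conditional dependence, which corresponds to "taking off" an indirect contribution. In the second it is negative, the signature of a collider, which is consistent with the abstract's statement that edge orientation relies on the sign of the 3-point information of unshielded triples.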
format Online
Article
Text
id pubmed-4959376
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4959376 2016-08-01 3off2: A network reconstruction algorithm based on 2-point and 3-point information statistics Affeldt, Séverine Verny, Louis Isambert, Hervé BMC Bioinformatics Proceedings BioMed Central 2016-01-20 /pmc/articles/PMC4959376/ /pubmed/26823190 http://dx.doi.org/10.1186/s12859-015-0856-x Text en © Affeldt et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title 3off2: A network reconstruction algorithm based on 2-point and 3-point information statistics
topic Proceedings