3off2: A network reconstruction algorithm based on 2-point and 3-point information statistics
BACKGROUND: The reconstruction of reliable graphical models from observational data is important in bioinformatics and other computational fields applying network reconstruction methods to large, yet finite datasets. The main network reconstruction approaches are either based on Bayesian scores, whi...
Main authors: | Affeldt, Séverine; Verny, Louis; Isambert, Hervé |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959376/ https://www.ncbi.nlm.nih.gov/pubmed/26823190 http://dx.doi.org/10.1186/s12859-015-0856-x |
_version_ | 1782444394126770176 |
---|---|
author | Affeldt, Séverine Verny, Louis Isambert, Hervé |
author_facet | Affeldt, Séverine Verny, Louis Isambert, Hervé |
author_sort | Affeldt, Séverine |
collection | PubMed |
description | BACKGROUND: The reconstruction of reliable graphical models from observational data is important in bioinformatics and other computational fields applying network reconstruction methods to large, yet finite datasets. The main network reconstruction approaches are either based on Bayesian scores, which enable the ranking of alternative Bayesian networks, or rely on the identification of structural independencies, which correspond to missing edges in the underlying network. Bayesian inference methods typically require heuristic search strategies, such as hill-climbing algorithms, to sample the super-exponential space of possible networks. By contrast, constraint-based methods, such as the PC and IC algorithms, are expected to run in polynomial time on sparse underlying graphs, provided that a correct list of conditional independencies is available. Yet, in practice, conditional independencies need to be ascertained from the available observational data, based on adjustable statistical significance levels, and are not robust to sampling noise from finite datasets. RESULTS: We propose a more robust approach to reconstruct graphical models from finite datasets. It combines constraint-based and Bayesian approaches to infer structural independencies based on the ranking of their most likely contributing nodes. In a nutshell, this local optimization scheme and corresponding 3off2 algorithm iteratively “take off” the most likely conditional 3-point information from the 2-point (mutual) information between each pair of nodes. Conditional independencies are thus derived by progressively collecting the most significant indirect contributions to all pairwise mutual information. The resulting network skeleton is then partially directed by orienting and propagating edge directions, based on the sign and magnitude of the conditional 3-point information of unshielded triples. The approach is shown to outperform both constraint-based and Bayesian inference methods on a range of benchmark networks. The 3off2 approach is then applied to the reconstruction of the hematopoiesis regulation network based on recent single cell expression data and is found to retrieve more experimentally ascertained regulations between transcription factors than with other available methods. CONCLUSIONS: The novel information-theoretic approach and corresponding 3off2 algorithm combine constraint-based and Bayesian inference methods to reliably reconstruct graphical models, despite inherent sampling noise in finite datasets. In particular, experimentally verified interactions as well as novel predicted regulations are established on the hematopoiesis regulatory networks based on single cell expression data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0856-x) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4959376 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4959376 2016-08-01 3off2: A network reconstruction algorithm based on 2-point and 3-point information statistics Affeldt, Séverine Verny, Louis Isambert, Hervé BMC Bioinformatics Proceedings BioMed Central 2016-01-20 /pmc/articles/PMC4959376/ /pubmed/26823190 http://dx.doi.org/10.1186/s12859-015-0856-x Text en © Affeldt et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | 3off2: A network reconstruction algorithm based on 2-point and 3-point information statistics |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4959376/ https://www.ncbi.nlm.nih.gov/pubmed/26823190 http://dx.doi.org/10.1186/s12859-015-0856-x |
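
The 2-point and 3-point information quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes discrete variables, plug-in (frequency-count) entropy estimates, and the standard definition I(X;Y;Z) = I(X;Y) - I(X;Y|Z). The function names and the toy data are hypothetical, and this is not the authors' 3off2 implementation, which the abstract describes as also ranking the most likely contributing nodes and coping with sampling noise.

```python
from collections import Counter
from math import log


def entropy(columns):
    """Plug-in joint entropy (in nats) of one or more aligned discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(c / n * log(c / n) for c in Counter(rows).values())


def mutual_info(x, y):
    """2-point (mutual) information I(X;Y)."""
    return entropy([x]) + entropy([y]) - entropy([x, y])


def cond_mutual_info(x, y, z):
    """Conditional mutual information I(X;Y|Z)."""
    return entropy([x, z]) + entropy([y, z]) - entropy([x, y, z]) - entropy([z])


def three_point_info(x, y, z):
    """3-point information I(X;Y;Z) = I(X;Y) - I(X;Y|Z)."""
    return mutual_info(x, y) - cond_mutual_info(x, y, z)


# Toy chain X <- Z -> Y: both X and Y copy Z, so Z explains their dependence.
chain_z = [0, 0, 1, 1]
chain_x = list(chain_z)
chain_y = list(chain_z)

# Toy collider X -> Z <- Y: X and Y are independent, Z is their XOR.
coll_x = [0, 0, 1, 1]
coll_y = [0, 1, 0, 1]
coll_z = [a ^ b for a, b in zip(coll_x, coll_y)]

chain_i3 = three_point_info(chain_x, chain_y, chain_z)
collider_i3 = three_point_info(coll_x, coll_y, coll_z)
print("chain 3-point information:", chain_i3)
print("collider 3-point information:", collider_i3)
```

In the toy chain, conditioning on Z removes all of the mutual information between X and Y, so the 3-point term is positive. In the toy collider it is negative, which is the kind of sign difference the abstract says is used to orient unshielded triples.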