
Learning causal networks with latent variables from multivariate information in genomic data

Learning causal networks from large-scale genomic data remains challenging in absence of time series or controlled perturbation experiments. We report an information-theoretic method which learns a large class of causal or non-causal graphical models from purely observational data, while including...


Bibliographic Details
Main authors: Verny, Louis; Sella, Nadir; Affeldt, Séverine; Singh, Param Priya; Isambert, Hervé
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5685645/
https://www.ncbi.nlm.nih.gov/pubmed/28968390
http://dx.doi.org/10.1371/journal.pcbi.1005662
_version_ 1783278661479170048
author Verny, Louis
Sella, Nadir
Affeldt, Séverine
Singh, Param Priya
Isambert, Hervé
author_sort Verny, Louis
collection PubMed
description Learning causal networks from large-scale genomic data remains challenging in absence of time series or controlled perturbation experiments. We report an information-theoretic method which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables, commonly found in many genomic datasets. Starting from a complete graph, the method iteratively removes dispensable edges, by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The approach and associated algorithm, miic, outperform earlier methods on a broad range of benchmark networks. Causal network reconstructions are presented at different biological size and time scales, from gene regulation in single cells to whole genome duplication in tumor development as well as long term evolution of vertebrates. Miic is publicly available at https://github.com/miicTeam/MIIC.
format Online
Article
Text
id pubmed-5685645
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-56856452017-11-30 Learning causal networks with latent variables from multivariate information in genomic data Verny, Louis Sella, Nadir Affeldt, Séverine Singh, Param Priya Isambert, Hervé PLoS Comput Biol Research Article Learning causal networks from large-scale genomic data remains challenging in absence of time series or controlled perturbation experiments. We report an information-theoretic method which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables, commonly found in many genomic datasets. Starting from a complete graph, the method iteratively removes dispensable edges, by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The approach and associated algorithm, miic, outperform earlier methods on a broad range of benchmark networks. Causal network reconstructions are presented at different biological size and time scales, from gene regulation in single cells to whole genome duplication in tumor development as well as long term evolution of vertebrates. Miic is publicly available at https://github.com/miicTeam/MIIC. Public Library of Science 2017-10-02 /pmc/articles/PMC5685645/ /pubmed/28968390 http://dx.doi.org/10.1371/journal.pcbi.1005662 Text en © 2017 Verny et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Verny, Louis
Sella, Nadir
Affeldt, Séverine
Singh, Param Priya
Isambert, Hervé
Learning causal networks with latent variables from multivariate information in genomic data
title Learning causal networks with latent variables from multivariate information in genomic data
title_sort learning causal networks with latent variables from multivariate information in genomic data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5685645/
https://www.ncbi.nlm.nih.gov/pubmed/28968390
http://dx.doi.org/10.1371/journal.pcbi.1005662