Gene network inference by fusing data from diverse distributions
Motivation: Markov networks are undirected graphical models that are widely used to infer relations between genes from experimental data. Their state-of-the-art inference procedures assume the data arise from a Gaussian distribution. High-throughput omics data, such as that from next generation sequ...
Main Authors: | Žitnik, Marinka; Zupan, Blaž |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4542780/ https://www.ncbi.nlm.nih.gov/pubmed/26072487 http://dx.doi.org/10.1093/bioinformatics/btv258 |
_version_ | 1782386561568997376 |
---|---|
author | Žitnik, Marinka Zupan, Blaž |
author_facet | Žitnik, Marinka Zupan, Blaž |
author_sort | Žitnik, Marinka |
collection | PubMed |
description | Motivation: Markov networks are undirected graphical models that are widely used to infer relations between genes from experimental data. Their state-of-the-art inference procedures assume the data arise from a Gaussian distribution. High-throughput omics data, such as that from next generation sequencing, often violates this assumption. Furthermore, when collected data arise from multiple related but otherwise nonidentical distributions, their underlying networks are likely to have common features. New principled statistical approaches are needed that can deal with different data distributions and jointly consider collections of datasets. Results: We present FuseNet, a Markov network formulation that infers networks from a collection of nonidentically distributed datasets. Our approach is computationally efficient and general: given any number of distributions from an exponential family, FuseNet represents model parameters through shared latent factors that define neighborhoods of network nodes. In a simulation study, we demonstrate good predictive performance of FuseNet in comparison to several popular graphical models. We show its effectiveness in an application to breast cancer RNA-sequencing and somatic mutation data, a novel application of graphical models. Fusion of datasets offers substantial gains relative to inference of separate networks for each dataset. Our results demonstrate that network inference methods for non-Gaussian data can help in accurate modeling of the data generated by emergent high-throughput technologies. Availability and implementation: Source code is at https://github.com/marinkaz/fusenet. Contact: blaz.zupan@fri.uni-lj.si Supplementary information: Supplementary information is available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4542780 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-45427802015-08-25 Gene network inference by fusing data from diverse distributions Žitnik, Marinka Zupan, Blaž Bioinformatics Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Motivation: Markov networks are undirected graphical models that are widely used to infer relations between genes from experimental data. Their state-of-the-art inference procedures assume the data arise from a Gaussian distribution. High-throughput omics data, such as that from next generation sequencing, often violates this assumption. Furthermore, when collected data arise from multiple related but otherwise nonidentical distributions, their underlying networks are likely to have common features. New principled statistical approaches are needed that can deal with different data distributions and jointly consider collections of datasets. Results: We present FuseNet, a Markov network formulation that infers networks from a collection of nonidentically distributed datasets. Our approach is computationally efficient and general: given any number of distributions from an exponential family, FuseNet represents model parameters through shared latent factors that define neighborhoods of network nodes. In a simulation study, we demonstrate good predictive performance of FuseNet in comparison to several popular graphical models. We show its effectiveness in an application to breast cancer RNA-sequencing and somatic mutation data, a novel application of graphical models. Fusion of datasets offers substantial gains relative to inference of separate networks for each dataset. Our results demonstrate that network inference methods for non-Gaussian data can help in accurate modeling of the data generated by emergent high-throughput technologies. Availability and implementation: Source code is at https://github.com/marinkaz/fusenet. 
Contact: blaz.zupan@fri.uni-lj.si Supplementary information: Supplementary information is available at Bioinformatics online. Oxford University Press 2015-06-15 2015-06-10 /pmc/articles/PMC4542780/ /pubmed/26072487 http://dx.doi.org/10.1093/bioinformatics/btv258 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Žitnik, Marinka Zupan, Blaž Gene network inference by fusing data from diverse distributions |
title | Gene network inference by fusing data from diverse distributions |
topic | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4542780/ https://www.ncbi.nlm.nih.gov/pubmed/26072487 http://dx.doi.org/10.1093/bioinformatics/btv258 |