Integration of molecular network data reconstructs Gene Ontology

Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated b...

Bibliographic Details
Main Authors: Gligorijević, Vladimir; Janjić, Vuk; Pržulj, Nataša
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4230235/
https://www.ncbi.nlm.nih.gov/pubmed/25161252
http://dx.doi.org/10.1093/bioinformatics/btu470
_version_ 1782344233344040960
author Gligorijević, Vladimir
Janjić, Vuk
Pržulj, Nataša
author_sort Gligorijević, Vladimir
collection PubMed
description Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker’s yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling.
Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. Contact: natasha@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4230235
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4230235 2014-11-13 Integration of molecular network data reconstructs Gene Ontology Gligorijević, Vladimir Janjić, Vuk Pržulj, Nataša Bioinformatics Eccb 2014 Proceedings Papers Committee Oxford University Press 2014-09-01 2014-08-22 /pmc/articles/PMC4230235/ /pubmed/25161252 http://dx.doi.org/10.1093/bioinformatics/btu470 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Eccb 2014 Proceedings Papers Committee
Gligorijević, Vladimir
Janjić, Vuk
Pržulj, Nataša
Integration of molecular network data reconstructs Gene Ontology
title Integration of molecular network data reconstructs Gene Ontology
topic Eccb 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4230235/
https://www.ncbi.nlm.nih.gov/pubmed/25161252
http://dx.doi.org/10.1093/bioinformatics/btu470