
Relevance of different prior knowledge sources for inferring gene interaction networks

When inferring networks from high-throughput genomic data, one of the main challenges is the subsequent validation of these networks. In the best case scenario, the true network is partially known from previous research results published in structured databases or research articles. Traditionally, i...


Bibliographic Details
Main Authors:	Olsen, Catharina; Bontempi, Gianluca; Emmert-Streib, Frank; Quackenbush, John; Haibe-Kains, Benjamin
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2014
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4067568/ https://www.ncbi.nlm.nih.gov/pubmed/25009552 http://dx.doi.org/10.3389/fgene.2014.00177

author	Olsen, Catharina; Bontempi, Gianluca; Emmert-Streib, Frank; Quackenbush, John; Haibe-Kains, Benjamin
collection	PubMed
description	When inferring networks from high-throughput genomic data, one of the main challenges is the subsequent validation of these networks. In the best-case scenario, the true network is partially known from previous research results published in structured databases or research articles. Traditionally, inferred networks are validated against these known interactions. Whenever the recovery rate is gauged to be high enough, subsequent high-scoring but unknown inferred interactions are deemed good candidates for further experimental validation. Such a validation framework therefore depends strongly on the quantity and quality of published interactions and presents serious pitfalls: (1) availability of these known interactions for the studied problem might be sparse; (2) quantitatively comparing different inference algorithms is not trivial; and (3) the use of these known interactions for validation prevents their integration in the inference procedure. The latter is particularly relevant as it has recently been shown that integration of priors during network inference significantly improves the quality of inferred networks. To overcome these problems when validating inferred networks, we recently proposed a data-driven validation framework based on single-gene knock-down experiments. Using this framework, we were able to demonstrate the benefits of integrating prior knowledge and expression data. In this paper we used this framework to assess the quality of different sources of prior knowledge on their own and in combination with different genomic data sets in colorectal cancer. We observed that most prior sources lead to significant F-scores. Furthermore, their integration with genomic data leads to a significant increase in F-scores, especially for priors extracted from full-text PubMed articles, known co-expression modules, and genetic interactions. Lastly, we observed that the results are consistent for three different data sets: experimental knock-down data and two human tumor data sets.
format	Online Article Text
id	pubmed-4067568
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-4067568 2014-07-09 Front Genet Genetics Frontiers Media S.A. 2014-06-24 /pmc/articles/PMC4067568/ /pubmed/25009552 http://dx.doi.org/10.3389/fgene.2014.00177 Text en Copyright © 2014 Olsen, Bontempi, Emmert-Streib, Quackenbush and Haibe-Kains. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Relevance of different prior knowledge sources for inferring gene interaction networks
topic	Genetics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4067568/ https://www.ncbi.nlm.nih.gov/pubmed/25009552 http://dx.doi.org/10.3389/fgene.2014.00177
