Augmenting Microarray Data with Literature-Based Knowledge to Enhance Gene Regulatory Network Inference

Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important...

Bibliographic Details
Main Authors: Chen, Guocai; Cairelli, Michael J.; Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4055569/
https://www.ncbi.nlm.nih.gov/pubmed/24921649
http://dx.doi.org/10.1371/journal.pcbi.1003666
_version_ 1782320678687473664
author Chen, Guocai
Cairelli, Michael J.
Kilicoglu, Halil
Shin, Dongwook
Rindflesch, Thomas C.
author_sort Chen, Guocai
collection PubMed
description Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductal carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature: 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.
format Online
Article
Text
id pubmed-4055569
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4055569 2014-06-18 Augmenting Microarray Data with Literature-Based Knowledge to Enhance Gene Regulatory Network Inference Chen, Guocai Cairelli, Michael J. Kilicoglu, Halil Shin, Dongwook Rindflesch, Thomas C. PLoS Comput Biol Research Article Public Library of Science 2014-06-12 /pmc/articles/PMC4055569/ /pubmed/24921649 http://dx.doi.org/10.1371/journal.pcbi.1003666 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title Augmenting Microarray Data with Literature-Based Knowledge to Enhance Gene Regulatory Network Inference
topic Research Article