
Modeling gene expression regulatory networks with the sparse vector autoregressive model

BACKGROUND: To understand the molecular mechanisms underlying important biological processes, a detailed description of the gene product networks involved is required. In order to define and understand such molecular networks, several statistical methods have been proposed in the literature to estimate gene...


Bibliographic Details
Main authors: Fujita, André; Sato, João R; Garay-Malpartida, Humberto M; Yamaguchi, Rui; Miyano, Satoru; Sogayar, Mari C; Ferreira, Carlos E
Format: Text
Language: English
Published: BioMed Central 2007
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2048982/
https://www.ncbi.nlm.nih.gov/pubmed/17761000
http://dx.doi.org/10.1186/1752-0509-1-39
author Fujita, André
Sato, João R
Garay-Malpartida, Humberto M
Yamaguchi, Rui
Miyano, Satoru
Sogayar, Mari C
Ferreira, Carlos E
collection PubMed
description BACKGROUND: To understand the molecular mechanisms underlying important biological processes, a detailed description of the gene product networks involved is required. In order to define and understand such molecular networks, several statistical methods have been proposed in the literature to estimate gene regulatory networks from time-series microarray data. However, several problems still need to be overcome. Firstly, information flow needs to be inferred, in addition to the correlation between genes. Secondly, we usually try to identify large networks from a large number of genes (parameters) measured in a smaller number of microarray experiments (samples). Because of this situation, which is rather frequent in bioinformatics, it is difficult to perform statistical tests using methods that model large gene-gene networks. In addition, most of the models are based on dimension reduction using clustering techniques; therefore, the resulting network is not a gene-gene network but a module-module network. Here, we present the Sparse Vector Autoregressive (SVAR) model as a solution to these problems. RESULTS: We have applied the Sparse Vector Autoregressive model to estimate gene regulatory networks based on gene expression profiles obtained from time-series microarray experiments. Through extensive simulations, by applying the SVAR method to artificial regulatory networks, we show that SVAR can infer true positive edges even under conditions in which the number of samples is smaller than the number of genes. Moreover, it is possible to control for false positives, a significant advantage when compared to other methods described in the literature, which are based on ranks or score functions. By applying SVAR to actual HeLa cell cycle gene expression data, we were able to identify well-known transcription factor targets.
CONCLUSION: The proposed SVAR method is able to model gene regulatory networks in the frequent situation in which the number of samples is lower than the number of genes, making it possible to naturally infer partial Granger causalities without any a priori information. In addition, we present a statistical test to control the false discovery rate, which was not previously possible using other gene regulatory network models.
format Text
id pubmed-2048982
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2048982 2007-11-03 BMC Syst Biol Methodology Article BioMed Central 2007-08-30 /pmc/articles/PMC2048982/ /pubmed/17761000 http://dx.doi.org/10.1186/1752-0509-1-39 Text en Copyright © 2007 Fujita et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Modeling gene expression regulatory networks with the sparse vector autoregressive model
topic Methodology Article
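
The abstract in this record describes estimating a gene-gene network from time-series data when there are fewer samples than genes. The record does not state the lag order, penalty, or test used in the paper, so the following is only a minimal sketch under stated assumptions: a first-order vector autoregressive model with an L1 (LASSO) penalty, fitted by proximal gradient descent on simulated data. The names `fit_sparse_var1`, `lasso_objective`, `soft_threshold`, and `lam`, as well as the simulated network, are hypothetical and chosen for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def lasso_objective(past, present, B, lam):
    """(1/2n) * squared residual + lam * L1 norm of the coefficients."""
    n = past.shape[0]
    resid = present - past @ B
    return 0.5 * np.sum(resid ** 2) / n + lam * np.sum(np.abs(B))


def fit_sparse_var1(X, lam, n_iter=2000):
    """Fit x_t ~ A @ x_{t-1} with an L1 penalty on A (hypothetical helper).

    X has shape (T, p): T time points, p genes. Returns A with shape (p, p);
    a nonzero A[i, j] is read as a candidate edge from gene j to gene i.
    """
    past, present = X[:-1], X[1:]
    n, p = past.shape
    B = np.zeros((p, p))  # B is A transposed, so that present ~ past @ B
    # Step size 1/L, where L bounds the curvature of the smooth loss term.
    L = np.linalg.norm(past, 2) ** 2 / n
    if L == 0.0:
        return B.T
    for _ in range(n_iter):
        grad = past.T @ (past @ B - present) / n
        B = soft_threshold(B - grad / L, lam / L)
    return B.T


# Simulated example (assumed, not data from the paper): 20 genes, 15 time points.
rng = np.random.default_rng(0)
p, T = 20, 15
A_true = np.zeros((p, p))
A_true[1, 0], A_true[2, 1], A_true[3, 2] = 0.8, -0.7, 0.6  # chain 0 -> 1 -> 2 -> 3
X = np.zeros((T, p))
X[0] = rng.normal(size=p)
for t in range(1, T):
    X[t] = A_true @ X[t - 1] + 0.5 * rng.normal(size=p)
X = X - X.mean(axis=0)  # center each gene's series

A_hat = fit_sparse_var1(X, lam=0.2)
edges = [(j, i, A_hat[i, j]) for i, j in zip(*np.nonzero(A_hat))]
for source, target, weight in edges:
    print(f"gene {source} -> gene {target}: {weight:+.3f}")
```

Larger values of `lam` shrink more coefficients to exactly zero and so yield a sparser network. This sketch only selects edges by penalization; it does not implement the statistical test for false discovery rate control that the abstract mentions.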