Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process

BACKGROUND: Causal networks based on the vector autoregressive (VAR) process are a promising statistical tool for modeling regulatory interactions in a cell. However, learning these networks is challenging due to the low sample size and high dimensionality of genomic data. RESULTS: We present a nove...

Bibliographic Details
Main Authors: Opgen-Rhein, Rainer; Strimmer, Korbinian
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892072/
https://www.ncbi.nlm.nih.gov/pubmed/17493252
http://dx.doi.org/10.1186/1471-2105-8-S2-S3
_version_ 1782133820639674368
author Opgen-Rhein, Rainer
Strimmer, Korbinian
author_facet Opgen-Rhein, Rainer
Strimmer, Korbinian
author_sort Opgen-Rhein, Rainer
collection PubMed
description BACKGROUND: Causal networks based on the vector autoregressive (VAR) process are a promising statistical tool for modeling regulatory interactions in a cell. However, learning these networks is challenging due to the low sample size and high dimensionality of genomic data. RESULTS: We present a novel and highly efficient approach to estimate a VAR network. This proceeds in two steps: (i) improved estimation of VAR regression coefficients using an analytic shrinkage approach, and (ii) subsequent model selection by testing the associated partial correlations. In simulations this approach outperformed for small sample size all other considered approaches in terms of true discovery rate (number of correctly identified edges relative to the significant edges). Moreover, the analysis of expression time series data from Arabidopsis thaliana resulted in a biologically sensible network. CONCLUSION: Statistical learning of large-scale VAR causal models can be done efficiently by the proposed procedure, even in the difficult data situations prevalent in genomics and proteomics. AVAILABILITY: The method is implemented in R code that is available from the authors on request.
format Text
id pubmed-1892072
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1892072 2007-06-15 Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process Opgen-Rhein, Rainer Strimmer, Korbinian BMC Bioinformatics Research BACKGROUND: Causal networks based on the vector autoregressive (VAR) process are a promising statistical tool for modeling regulatory interactions in a cell. However, learning these networks is challenging due to the low sample size and high dimensionality of genomic data. RESULTS: We present a novel and highly efficient approach to estimate a VAR network. This proceeds in two steps: (i) improved estimation of VAR regression coefficients using an analytic shrinkage approach, and (ii) subsequent model selection by testing the associated partial correlations. In simulations this approach outperformed for small sample size all other considered approaches in terms of true discovery rate (number of correctly identified edges relative to the significant edges). Moreover, the analysis of expression time series data from Arabidopsis thaliana resulted in a biologically sensible network. CONCLUSION: Statistical learning of large-scale VAR causal models can be done efficiently by the proposed procedure, even in the difficult data situations prevalent in genomics and proteomics. AVAILABILITY: The method is implemented in R code that is available from the authors on request. BioMed Central 2007-05-03 /pmc/articles/PMC1892072/ /pubmed/17493252 http://dx.doi.org/10.1186/1471-2105-8-S2-S3 Text en Copyright © 2007 Opgen-Rhein and Strimmer; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Opgen-Rhein, Rainer
Strimmer, Korbinian
Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
title Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
title_full Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
title_fullStr Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
title_full_unstemmed Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
title_short Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
title_sort learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892072/
https://www.ncbi.nlm.nih.gov/pubmed/17493252
http://dx.doi.org/10.1186/1471-2105-8-S2-S3
work_keys_str_mv AT opgenrheinrainer learningcausalnetworksfromsystemsbiologytimecoursedataaneffectivemodelselectionprocedureforthevectorautoregressiveprocess
AT strimmerkorbinian learningcausalnetworksfromsystemsbiologytimecoursedataaneffectivemodelselectionprocedureforthevectorautoregressiveprocess