
A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data

BACKGROUND: Gene regulatory interactions are of fundamental importance to various biological functions and processes. However, only a few previous computational studies have claimed success in revealing genome-wide regulatory landscapes from temporal gene expression data, especially for complex euka...


Bibliographic Details
Main authors: Gui, Shupeng, Rice, Andrew P., Chen, Rui, Wu, Liang, Liu, Ji, Miao, Hongyu
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5294888/
https://www.ncbi.nlm.nih.gov/pubmed/28143596
http://dx.doi.org/10.1186/s12859-017-1489-z
author Gui, Shupeng
Rice, Andrew P.
Chen, Rui
Wu, Liang
Liu, Ji
Miao, Hongyu
collection PubMed
description BACKGROUND: Gene regulatory interactions are of fundamental importance to various biological functions and processes. However, only a few previous computational studies have claimed success in revealing genome-wide regulatory landscapes from temporal gene expression data, especially for complex eukaryotes such as humans. Moreover, recent work suggests that these methods still suffer from the curse of dimensionality once the network size reaches 100 or more. RESULTS: Here we present a novel scalable algorithm for identifying genome-wide gene regulatory network (GRN) structures, and we have verified the algorithm's performance by extensive simulation studies based on the DREAM challenge benchmark data. The highlight of our method is that its superior performance does not degrade even for a network size on the order of 10^4, and it is thus readily applicable to large-scale complex networks. Such a breakthrough is achieved by considering both prior biological knowledge and multiple topological properties (i.e., sparsity and hub gene structure) of complex networks in the regularized formulation. We also validate and illustrate the application of our algorithm in practice using the time-course gene expression data from a study on human respiratory epithelial cells in response to influenza A virus (IAV) infection, as well as the ChIP-seq data from ENCODE on transcription factor (TF) and target gene interactions. An interesting finding, owing to the proposed algorithm, is that the biggest hub structures (e.g., top ten) in the GRN are all centered on transcription factors in the context of epithelial cell infection by IAV. CONCLUSIONS: The proposed algorithm is the first scalable method for large complex network structure identification. The GRN structure identified by our algorithm could reveal possible biological links and help researchers to choose which gene functions to investigate in a biological event. The algorithm described in this article is implemented in MATLAB®, and the source code is freely available from https://github.com/Hongyu-Miao/DMI.git. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1489-z) contains supplementary material, which is available to authorized users.
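The abstract refers to a regularized formulation that combines sparsity, hub structure, and prior knowledge, but this record does not give the formulation itself. As a rough illustration of the general idea only, the sketch below fits a sparse interaction matrix to time-course data under an assumed linear discrete-time model x(t+1) ≈ A x(t), using an L1 penalty solved by proximal gradient descent (ISTA). It is not the authors' method and not the code in the linked repository: the model, the function names `soft_threshold` and `fit_sparse_network`, the parameters `lam` and `n_iter`, and the optional binary `mask` standing in for prior knowledge of admissible edges are all assumptions made for this example. A hub-promoting penalty is not included.

```python
import numpy as np


def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def fit_sparse_network(X, lam=1.0, n_iter=500, mask=None):
    """Estimate a sparse p x p matrix A from a (T, p) time series X.

    Assumed model (hypothetical, for illustration): X[t + 1] ~ A @ X[t].
    Objective: 0.5 * ||Y - Z A^T||_F^2 + lam * ||A||_1,
    with Z = X[:-1] and Y = X[1:].
    mask: optional (p, p) 0/1 array; entries with 0 are forced to zero,
    a stand-in for prior knowledge about which edges are allowed.
    """
    Z, Y = X[:-1], X[1:]
    p = X.shape[1]
    A = np.zeros((p, p))

    # Lipschitz constant of the least-squares gradient: largest eigenvalue
    # of Z^T Z, i.e. the squared spectral norm of Z.
    lipschitz = np.linalg.norm(Z, 2) ** 2
    if lipschitz == 0.0:
        return A  # all-zero predictors carry no information
    step = 1.0 / lipschitz

    for _ in range(n_iter):
        grad = (A @ Z.T - Y.T) @ Z          # gradient of the smooth term
        A = soft_threshold(A - step * grad, step * lam)
        if mask is not None:
            A = A * mask                    # project onto the allowed edges
    return A
```

Entry `A[i, j]` is read as the estimated influence of gene `j` on gene `i`; nonzero entries are the candidate edges. Larger `lam` gives a sparser network, and any `lam` at or above the largest absolute entry of `Y.T @ Z` returns the empty network.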
format Online
Article
Text
id pubmed-5294888
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5294888 2017-02-09 A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data Gui, Shupeng Rice, Andrew P. Chen, Rui Wu, Liang Liu, Ji Miao, Hongyu BMC Bioinformatics Methodology Article BioMed Central 2017-01-31 /pmc/articles/PMC5294888/ /pubmed/28143596 http://dx.doi.org/10.1186/s12859-017-1489-z Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5294888/
https://www.ncbi.nlm.nih.gov/pubmed/28143596
http://dx.doi.org/10.1186/s12859-017-1489-z