A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data
BACKGROUND: Gene regulatory interactions are of fundamental importance to various biological functions and processes. However, only a few previous computational studies have claimed success in revealing genome-wide regulatory landscapes from temporal gene expression data, especially for complex euka...
Main Authors: | Gui, Shupeng; Rice, Andrew P.; Chen, Rui; Wu, Liang; Liu, Ji; Miao, Hongyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5294888/ https://www.ncbi.nlm.nih.gov/pubmed/28143596 http://dx.doi.org/10.1186/s12859-017-1489-z |
_version_ | 1782505326316093440 |
---|---|
author | Gui, Shupeng Rice, Andrew P. Chen, Rui Wu, Liang Liu, Ji Miao, Hongyu |
collection | PubMed |
description | BACKGROUND: Gene regulatory interactions are of fundamental importance to various biological functions and processes. However, only a few previous computational studies have claimed success in revealing genome-wide regulatory landscapes from temporal gene expression data, especially for complex eukaryotes such as humans. Moreover, recent work suggests that these methods still suffer from the curse of dimensionality once the network size reaches 100 or more. RESULTS: Here we present a novel scalable algorithm for identifying genome-wide gene regulatory network (GRN) structures, and we have verified the algorithm's performance through extensive simulation studies based on the DREAM challenge benchmark data. The highlight of our method is that its superior performance does not degrade even for a network size on the order of 10^4, so it is readily applicable to large-scale complex networks. This breakthrough is achieved by incorporating both prior biological knowledge and multiple topological properties of complex networks (i.e., sparsity and hub gene structure) into the regularized formulation. We also validate and illustrate the application of our algorithm in practice using time-course gene expression data from a study on human respiratory epithelial cells in response to influenza A virus (IAV) infection, as well as ChIP-seq data from ENCODE on transcription factor (TF) and target gene interactions. An interesting finding, owing to the proposed algorithm, is that the biggest hub structures (e.g., the top ten) in the GRN all center on transcription factors in the context of epithelial cell infection by IAV. CONCLUSIONS: The proposed algorithm is the first scalable method for large complex network structure identification. The GRN structure identified by our algorithm could reveal possible biological links and help researchers choose which gene functions to investigate in a biological event. 
The algorithm described in this article is implemented in MATLAB®, and the source code is freely available from https://github.com/Hongyu-Miao/DMI.git. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1489-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5294888 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5294888 2017-02-09 A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data Gui, Shupeng Rice, Andrew P. Chen, Rui Wu, Liang Liu, Ji Miao, Hongyu BMC Bioinformatics Methodology Article BioMed Central 2017-01-31 /pmc/articles/PMC5294888/ /pubmed/28143596 http://dx.doi.org/10.1186/s12859-017-1489-z Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data |
topic | Methodology Article |