
MICRAT: a novel algorithm for inferring gene regulatory networks using time series gene expression data

BACKGROUND: Reconstruction of gene regulatory networks (GRNs), also known as reverse engineering of GRNs, aims to infer the potential regulation relationships between genes. With the development of biotechnology, such as gene chip microarray and RNA-sequencing, the high-throughput data generated pro...

Bibliographic Details
Main Authors: Yang, Bei; Xu, Yaohui; Maxwell, Andrew; Koh, Wonryull; Gong, Ping; Zhang, Chaoyang
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6293491/
https://www.ncbi.nlm.nih.gov/pubmed/30547796
http://dx.doi.org/10.1186/s12918-018-0635-1
author Yang, Bei
Xu, Yaohui
Maxwell, Andrew
Koh, Wonryull
Gong, Ping
Zhang, Chaoyang
collection PubMed
description BACKGROUND: Reconstruction of gene regulatory networks (GRNs), also known as reverse engineering of GRNs, aims to infer the potential regulatory relationships between genes. With the development of high-throughput technologies such as gene chip microarrays and RNA sequencing, the data generated provide more opportunities to infer gene-gene interactions from gene expression data and hence to understand the underlying mechanisms of biological processes. Gene regulatory networks are known to exhibit a multiplicity of interaction mechanisms, including functional and non-functional as well as linear and non-linear relationships. Moreover, regulatory interactions between genes and gene products are not instantaneous: the various processes involved in producing fully functional and measurable concentrations of transcription factors/proteins lead to a delay in gene regulation. Many approaches for reconstructing GRNs have been proposed, but existing approaches such as probabilistic Boolean networks and dynamic Bayesian networks have various limitations and relatively low accuracy. Inferring GRNs from time series microarray or RNA-sequencing data remains a very challenging inverse problem because of its nonlinearity, high dimensionality, sparse and noisy data, and significant computational cost, which motivates the development of more effective inference methods.
RESULTS: We developed a novel algorithm, MICRAT (Maximal Information coefficient with Conditional Relative Average entropy and Time-series mutual information), for inferring GRNs from time series gene expression data. The maximal information coefficient (MIC) is an effective measure of dependence for two-variable relationships. It captures a wide range of associations, both functional and non-functional, and thus performs well at measuring the dependence between two genes. Our approach consists of two main procedures. First, it employs the maximal information coefficient to construct an undirected graph that represents the underlying relationships between genes. Second, it directs the edges of the undirected graph to infer regulators and their targets. In this procedure, the conditional relative average entropies of each pair of nodes (genes) are used to indicate the directions of edges. Because a time delay may exist between the expression of regulators and that of their target genes, time series mutual information is combined with the conditional relative average entropy to direct the edges and infer the potential regulators and their targets. We evaluated the performance of MICRAT by applying it to synthetic datasets as well as real gene expression data and comparing it with other GRN inference methods. We inferred five 10-gene and five 100-gene networks from the DREAM4 challenge, which were generated using the gene expression simulator GeneNetWeaver (GNW). MICRAT was also used to reconstruct GRNs from real gene expression data, including part of the DNA damage response pathway (the SOS DNA repair network) and an experimental dataset in E. coli. The results showed that MICRAT significantly improved inference accuracy compared with other inference methods such as TDBN.
CONCLUSION: In this work, a novel algorithm, MICRAT, for inferring GRNs from time series gene expression data was proposed that takes into account the dependence and the time delay between the expression of a regulator and that of its target genes. The approach employs maximal information coefficients to reconstruct an undirected graph that represents the underlying relationships between genes. The edges are then directed by combining conditional relative average entropy with time course mutual information of pairs of genes. The proposed algorithm was evaluated on the benchmark GRNs provided by the DREAM4 challenge and on part of the real SOS DNA repair network in E. coli. The experimental study showed that our approach was comparable to other methods on 10-gene datasets and outperformed other methods on 100-gene datasets in GRN inference from time series data.
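The edge-direction idea in the abstract (a regulator's expression tends to precede its target's) can be illustrated with a time-lagged mutual information comparison. The following is a minimal sketch only, not the authors' MICRAT implementation: it uses a simple histogram (plug-in) estimator, omits the MIC and conditional relative average entropy steps, and the function names, the `lag` and `bins` parameters, and the decision rule are all assumptions made for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of mutual information, in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def lagged_mi(source, target, lag=1, bins=4):
    """Mutual information between source at time t and target at time t + lag."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    if len(source) != len(target):
        raise ValueError("series must have the same length")
    if lag < 1 or lag >= len(source):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    return mutual_information(source[:-lag], target[lag:], bins=bins)


def suggest_direction(x, y, lag=1, bins=4):
    """Hypothetical rule: orient the edge toward the series that follows the other."""
    forward = lagged_mi(x, y, lag, bins)     # x leads y
    backward = lagged_mi(y, x, lag, bins)    # y leads x
    if forward > backward:
        return "x->y"
    if backward > forward:
        return "y->x"
    return "undecided"


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    regulator = rng.uniform(size=200)
    target = np.roll(regulator, 1)           # target copies regulator one step later
    print(suggest_direction(regulator, target))
```

Histogram estimators are biased on short series, so with the small number of time points typical of expression experiments the comparison is only indicative; the bin count and lag would need to be chosen for the data at hand.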
format Online
Article
Text
id pubmed-6293491
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6293491 2018-12-17 BMC Syst Biol Research BioMed Central 2018-12-14 /pmc/articles/PMC6293491/ /pubmed/30547796 http://dx.doi.org/10.1186/s12918-018-0635-1 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MICRAT: a novel algorithm for inferring gene regulatory networks using time series gene expression data
topic Research