Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks

Bibliographic Details
Main authors: Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Zhang, Chaoyang
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3026366/
https://www.ncbi.nlm.nih.gov/pubmed/20946602
http://dx.doi.org/10.1186/1471-2105-11-S6-S19
author Chaitankar, Vijender
Ghosh, Preetam
Perkins, Edward J
Gong, Ping
Zhang, Chaoyang
collection PubMed
description BACKGROUND: A number of models and algorithms have been proposed for gene regulatory network (GRN) inference; however, none of them addresses how the size of time-series microarray expression data, measured as the number of time points, affects inference. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were run on data sets of different sizes generated from synthetic networks. Experiments show that the inference accuracy of these algorithms saturates beyond a specific data size, and that this is caused by saturation of the pair-wise mutual information (MI) metric; hence there is a theoretical limit on the inference accuracy of information theory based schemes that depends on the number of time points of microarray data used to infer GRNs. This suggests that MI might not be the best metric for GRN inference algorithms. To circumvent the limitations of the MI metric, we introduce a new method of computing time lags between any pair of genes and present the pair-wise time lagged Mutual Information (TLMI) and time lagged Conditional Mutual Information (TLCMI) metrics. We then use these new metrics to propose novel GRN inference schemes that provide higher inference accuracy in terms of precision and recall.
RESULTS: Beyond a certain number of time points (i.e., a specific size) of microarray data, the performance of the algorithms, measured as the recall-to-precision ratio, saturated because the calculated pair-wise MI metric itself saturated with increasing data size. The proposed algorithms were compared to existing approaches on four different biological networks. The resulting networks were evaluated using the benchmark precision and recall metrics, and the results favour our approach.
CONCLUSIONS: To alleviate the effects of data size on information theory based GRN inference algorithms, novel time lag based information theoretic approaches to infer gene regulatory networks have been proposed. The results show that the time lags of regulatory effects between any pair of genes play an important role in GRN inference schemes.
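The abstract names a pair-wise time lagged mutual information (TLMI) metric but does not state how the MI is estimated or how the lag between two genes is chosen. The following is only a minimal sketch of the general idea, under assumptions that are not taken from the paper: expression values are discretized by equal-width binning, MI is computed with a plug-in (histogram) estimator, and the lag is picked as the shift that maximizes MI. All function names (`discretize`, `mutual_information`, `time_lagged_mi`, `best_lag`) are hypothetical.

```python
import numpy as np


def discretize(x, bins=3):
    """Equal-width binning of a 1-D expression series into integer levels.

    Assumption: the paper's discretization scheme is not given in the abstract.
    """
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), bins + 1)[1:-1]  # interior bin edges
    return np.digitize(x, edges)


def mutual_information(a, b):
    """Plug-in estimate of MI (in bits) between two discrete sequences."""
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1)          # joint count table
    joint /= joint.sum()                   # joint probabilities
    pa = joint.sum(axis=1, keepdims=True)  # marginal of a (column vector)
    pb = joint.sum(axis=0, keepdims=True)  # marginal of b (row vector)
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())


def time_lagged_mi(x, y, lag):
    """MI between x(t) and y(t + lag) for a non-negative integer lag."""
    x, y = np.asarray(x), np.asarray(y)
    if lag < 0 or lag >= len(x):
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    n = len(x) - lag
    return mutual_information(x[:n], y[lag:lag + n])


def best_lag(x, y, max_lag):
    """Hypothetical lag choice: the shift in 0..max_lag with the largest MI."""
    scores = [time_lagged_mi(x, y, k) for k in range(max_lag + 1)]
    k = int(np.argmax(scores))
    return k, scores[k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    regulator = rng.integers(0, 2, size=200)  # synthetic on/off regulator
    target = np.roll(regulator, 2)            # target copies regulator 2 steps later
    print(best_lag(regulator, target, max_lag=5))
```

In this toy setup the target is an exact delayed copy of the regulator, so the MI at the true delay approaches the regulator's entropy while the MI at other shifts stays near zero. With real microarray series the plug-in estimate is biased for short time courses, which is in line with the data-size sensitivity the abstract reports.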
format Text
id pubmed-3026366
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3026366 2011-01-26 BMC Bioinformatics Proceedings BioMed Central 2010-10-07 Text en Copyright ©2010 Zhang et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks
topic Proceedings