Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks
The inference of gene regulatory networks from gene expression data is a difficult problem because the performance of the inference algorithms depends on a multitude of different factors. In this paper we study two of these. First, we investigate the influence of discrete mutual information (MI) est...
Main authors: | de Matos Simoes, Ricardo; Emmert-Streib, Frank |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2011 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3248437/ https://www.ncbi.nlm.nih.gov/pubmed/22242113 http://dx.doi.org/10.1371/journal.pone.0029279 |
_version_ | 1782220243273252864 |
---|---|
author | de Matos Simoes, Ricardo Emmert-Streib, Frank |
author_facet | de Matos Simoes, Ricardo Emmert-Streib, Frank |
author_sort | de Matos Simoes, Ricardo |
collection | PubMed |
description | The inference of gene regulatory networks from gene expression data is a difficult problem because the performance of the inference algorithms depends on a multitude of different factors. In this paper we study two of these. First, we investigate the influence of discrete mutual information (MI) estimators on the global and local network inference performance of the C3NET algorithm. More precisely, we study four different MI estimators (Empirical, Miller-Madow, Shrink and Schürmann-Grassberger) in combination with three discretization methods (equal frequency, equal width and global equal width discretization). We observe the best global and local inference performance of C3NET for the Miller-Madow estimator with an equal width discretization. Second, our numerical analysis can be considered as a systems approach because we simulate gene expression data from an underlying gene regulatory network, instead of making a distributional assumption to sample thereof. We demonstrate that despite the popularity of the latter approach, which is the traditional way of studying MI estimators, this is in fact not supported by simulated and biological expression data because of their heterogeneity. Hence, our study provides guidance for an efficient design of a simulation study in the context of network inference, supporting a systems approach. |
format | Online Article Text |
id | pubmed-3248437 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3248437 2012-01-12 Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks de Matos Simoes, Ricardo Emmert-Streib, Frank PLoS One Research Article The inference of gene regulatory networks from gene expression data is a difficult problem because the performance of the inference algorithms depends on a multitude of different factors. In this paper we study two of these. First, we investigate the influence of discrete mutual information (MI) estimators on the global and local network inference performance of the C3NET algorithm. More precisely, we study four different MI estimators (Empirical, Miller-Madow, Shrink and Schürmann-Grassberger) in combination with three discretization methods (equal frequency, equal width and global equal width discretization). We observe the best global and local inference performance of C3NET for the Miller-Madow estimator with an equal width discretization. Second, our numerical analysis can be considered as a systems approach because we simulate gene expression data from an underlying gene regulatory network, instead of making a distributional assumption to sample thereof. We demonstrate that despite the popularity of the latter approach, which is the traditional way of studying MI estimators, this is in fact not supported by simulated and biological expression data because of their heterogeneity. Hence, our study provides guidance for an efficient design of a simulation study in the context of network inference, supporting a systems approach. Public Library of Science 2011-12-29 /pmc/articles/PMC3248437/ /pubmed/22242113 http://dx.doi.org/10.1371/journal.pone.0029279 Text en de Matos Simoes, Emmert-Streib.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article de Matos Simoes, Ricardo Emmert-Streib, Frank Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks |
title | Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks |
title_full | Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks |
title_fullStr | Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks |
title_full_unstemmed | Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks |
title_short | Influence of Statistical Estimators of Mutual Information and Data Heterogeneity on the Inference of Gene Regulatory Networks |
title_sort | influence of statistical estimators of mutual information and data heterogeneity on the inference of gene regulatory networks |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3248437/ https://www.ncbi.nlm.nih.gov/pubmed/22242113 http://dx.doi.org/10.1371/journal.pone.0029279 |
work_keys_str_mv | AT dematossimoesricardo influenceofstatisticalestimatorsofmutualinformationanddataheterogeneityontheinferenceofgeneregulatorynetworks AT emmertstreibfrank influenceofstatisticalestimatorsofmutualinformationanddataheterogeneityontheinferenceofgeneregulatorynetworks |
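
Illustrative note: the description above names the Miller-Madow estimator with equal width discretization as the best-performing combination. The following is a minimal Python sketch of that combination, not the authors' code. All function names (`equal_width_bins`, `entropy`, `mutual_information`) and the default of 4 bins are assumptions made for illustration. It uses the standard definitions: the empirical (plug-in) entropy, the Miller-Madow correction of (m - 1) / (2n) for m non-empty bins and n samples, and MI = H(X) + H(Y) - H(X, Y). The Shrink and Schürmann-Grassberger estimators and the other two discretization methods are not covered here.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to one of n_bins intervals of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant vector falls entirely into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels, miller_madow=False):
    """Empirical entropy in nats, optionally with the Miller-Madow bias correction."""
    n = len(labels)
    counts = Counter(labels)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    if miller_madow:
        # Correction term: (number of non-empty bins - 1) / (2 * sample size).
        h += (len(counts) - 1) / (2 * n)
    return h


def mutual_information(x, y, n_bins=4, miller_madow=False):
    """Estimate MI(X; Y) = H(X) + H(Y) - H(X, Y) on equal-width discretized data."""
    bx = equal_width_bins(x, n_bins)
    by = equal_width_bins(y, n_bins)
    joint = list(zip(bx, by))
    return (entropy(bx, miller_madow)
            + entropy(by, miller_madow)
            - entropy(joint, miller_madow))


if __name__ == "__main__":
    # Hypothetical toy expression profiles for two genes (not data from the paper).
    gene_a = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.75, 0.5]
    gene_b = [0.2, 0.5, 0.30, 0.9, 0.8, 0.1, 0.70, 0.6]
    print("empirical:   ", mutual_information(gene_a, gene_b))
    print("Miller-Madow:", mutual_information(gene_a, gene_b, miller_madow=True))
```

In a network inference setting such as the one the record describes, an estimator like this would be evaluated for every gene pair to fill an MI matrix, which an algorithm such as C3NET then takes as input. That pipeline step is not shown here.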