
Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways

BACKGROUND: Present knowledge indicates a multilayered hierarchical gene regulatory network (ML-hGRN) often operates above a biological pathway. Although the ML-hGRN is very important for understanding how a pathway is regulated, there is almost no computational algorithm for directly constructing M...


Bibliographic Details
Main Authors: Deng, Wenping, Zhang, Kui, Busov, Victor, Wei, Hairong
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5291523/
https://www.ncbi.nlm.nih.gov/pubmed/28158291
http://dx.doi.org/10.1371/journal.pone.0171532
author Deng, Wenping
Zhang, Kui
Busov, Victor
Wei, Hairong
collection PubMed
description BACKGROUND: Present knowledge indicates that a multilayered hierarchical gene regulatory network (ML-hGRN) often operates above a biological pathway. Although the ML-hGRN is very important for understanding how a pathway is regulated, there is almost no computational algorithm for directly constructing ML-hGRNs. RESULTS: A backward elimination random forest (BWERF) algorithm was developed for constructing the ML-hGRN operating above a biological pathway. For each pathway gene, BWERF used a random forest model to calculate the importance values of all transcription factors (TFs) to that gene. The model was refitted recursively, with a portion (e.g. 1/10) of the least important TFs excluded in each round; in every round the importance values of the TFs to the pathway gene were updated and ranked, until only one TF remained in the list. This procedure is termed BWERF. After that, the importance values of a TF to all pathway genes were aggregated and fitted to a Gaussian mixture model to determine which TFs to retain for the regulatory layer immediately above the pathway layer. The TFs acquired at this second layer were then set as the new bottom layer to infer the next upper layer, and this process was repeated until an ML-hGRN with the expected number of layers was obtained. CONCLUSIONS: BWERF improved the accuracy of ML-hGRN construction because it used backward elimination to exclude noise genes and aggregated the individual importance values to determine TF retention. We validated BWERF by using it to construct ML-hGRNs operating above the mouse pluripotency maintenance pathway and the Arabidopsis lignocellulosic pathway. Compared to GENIE3, BWERF showed an improvement in recognizing authentic TFs regulating a pathway. Compared to the bottom-up Gaussian graphical model algorithm we developed for constructing ML-hGRNs, BWERF can construct ML-hGRNs with significantly fewer edges, which enables biologists to choose the implicit edges for experimental validation.
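The backward elimination loop described in the abstract can be sketched roughly as follows for a single pathway gene. This is a minimal illustration, not the authors' implementation: the function names (`backward_elimination`, `make_rf_scorer`), the `expr` dictionary layout, the use of scikit-learn's impurity-based `feature_importances_` as the importance measure, and the choice to keep each TF's importance from the last round in which it was still in the model are all assumptions. The aggregation across pathway genes and the Gaussian mixture model step are not shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def backward_elimination(
    tfs: Sequence[str],
    score: Callable[[List[str]], Dict[str, float]],
    drop_fraction: float = 0.1,
) -> Tuple[Dict[str, float], List[str]]:
    """Repeatedly score the remaining TFs and drop the least important ones.

    Returns (last_importance, elimination_order):
      last_importance   - importance of each TF from the last round it took part in
      elimination_order - TFs in the order they were removed, with the survivor last
    """
    remaining = list(tfs)
    last_importance: Dict[str, float] = {}
    elimination_order: List[str] = []
    while remaining:
        importance = score(remaining)  # refit on the current TF set
        last_importance.update((tf, importance[tf]) for tf in remaining)
        if len(remaining) == 1:
            elimination_order.append(remaining[0])
            break
        ranked = sorted(remaining, key=lambda tf: importance[tf], reverse=True)
        # Drop a fixed fraction per round, at least one TF, but never all of them.
        n_drop = min(len(ranked) - 1, max(1, int(len(ranked) * drop_fraction)))
        elimination_order.extend(reversed(ranked[-n_drop:]))  # least important first
        remaining = ranked[:-n_drop]
    return last_importance, elimination_order


def make_rf_scorer(expr, target, n_trees: int = 500, seed: int = 0):
    """Build a scorer backed by a random forest (assumed importance measure).

    expr   - hypothetical dict mapping TF name -> 1-D array of expression values
    target - 1-D array with the pathway gene's expression over the same samples
    """
    # Imported here so the elimination loop itself has no third-party dependency.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def score(current_tfs: List[str]) -> Dict[str, float]:
        X = np.column_stack([expr[tf] for tf in current_tfs])
        model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        model.fit(X, target)
        return dict(zip(current_tfs, model.feature_importances_))

    return score
```

With expression data in hand, a call such as `backward_elimination(list(expr), make_rf_scorer(expr, target))` would run the loop for one pathway gene; repeating it for every pathway gene gives the per-gene importance values that the abstract says are then aggregated.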
format Online
Article
Text
id pubmed-5291523
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5291523 2017-02-17 Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways Deng, Wenping Zhang, Kui Busov, Victor Wei, Hairong PLoS One Research Article Public Library of Science 2017-02-03 /pmc/articles/PMC5291523/ /pubmed/28158291 http://dx.doi.org/10.1371/journal.pone.0171532 Text en © 2017 Deng et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5291523/
https://www.ncbi.nlm.nih.gov/pubmed/28158291
http://dx.doi.org/10.1371/journal.pone.0171532