Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways
BACKGROUND: Present knowledge indicates a multilayered hierarchical gene regulatory network (ML-hGRN) often operates above a biological pathway. Although the ML-hGRN is very important for understanding how a pathway is regulated, there is almost no computational algorithm for directly constructing M...
Main authors: | Deng, Wenping; Zhang, Kui; Busov, Victor; Wei, Hairong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2017 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5291523/ https://www.ncbi.nlm.nih.gov/pubmed/28158291 http://dx.doi.org/10.1371/journal.pone.0171532 |
_version_ | 1782504793996001280 |
---|---|
author | Deng, Wenping Zhang, Kui Busov, Victor Wei, Hairong |
author_facet | Deng, Wenping Zhang, Kui Busov, Victor Wei, Hairong |
author_sort | Deng, Wenping |
collection | PubMed |
description | BACKGROUND: Present knowledge indicates that a multilayered hierarchical gene regulatory network (ML-hGRN) often operates above a biological pathway. Although the ML-hGRN is very important for understanding how a pathway is regulated, almost no computational algorithms exist for directly constructing ML-hGRNs. RESULTS: A backward elimination random forest (BWERF) algorithm was developed for constructing the ML-hGRN operating above a biological pathway. For each pathway gene, BWERF used a random forest model to recursively calculate the importance values of all transcription factors (TFs) to that gene. In each round of modeling, a portion (e.g. 1/10) of the least important TFs was excluded, and the importance values of the TFs to the pathway gene were updated and ranked, until only one TF remained in the list. This procedure is termed BWERF. After that, the importance values of a TF to all pathway genes were aggregated and fitted to a Gaussian mixture model to determine TF retention for the regulatory layer immediately above the pathway layer. The TFs acquired at the secondary layer were then set as the new bottom layer to infer the next upper layer, and this process was repeated until an ML-hGRN with the expected number of layers was obtained. CONCLUSIONS: BWERF improved the accuracy of constructing ML-hGRNs because it used backward elimination to exclude noise genes and aggregated the individual importance values to determine TF retention. We validated BWERF by using it to construct ML-hGRNs operating above the mouse pluripotency maintenance pathway and the Arabidopsis lignocellulosic pathway. Compared to GENIE3, BWERF showed an improvement in recognizing authentic TFs regulating a pathway. Compared to the bottom-up Gaussian graphical model algorithm we developed for constructing ML-hGRNs, BWERF can construct ML-hGRNs with significantly fewer edges, which enables biologists to choose the implicit edges for experimental validation. |
format | Online Article Text |
id | pubmed-5291523 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5291523 2017-02-17 Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways Deng, Wenping Zhang, Kui Busov, Victor Wei, Hairong PLoS One Research Article Public Library of Science 2017-02-03 /pmc/articles/PMC5291523/ /pubmed/28158291 http://dx.doi.org/10.1371/journal.pone.0171532 Text en © 2017 Deng et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Deng, Wenping Zhang, Kui Busov, Victor Wei, Hairong Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways |
title | Recursive random forest algorithm for constructing multilayered hierarchical gene regulatory networks that govern biological pathways |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5291523/ https://www.ncbi.nlm.nih.gov/pubmed/28158291 http://dx.doi.org/10.1371/journal.pone.0171532 |
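
The backward elimination step summarized in the description field can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the data layout, and the rule of dropping at least one TF per round are assumptions made for illustration. To keep the example dependency-free, the demo scores TFs by absolute Pearson correlation as a plainly labeled stand-in for random forest importance; a random forest scorer could be passed in through the same `score_fn` argument.

```python
import math
import random


def pearson_abs(x, y):
    """Absolute Pearson correlation (stand-in importance, NOT a random forest)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def correlation_scores(tf_expr, target):
    """Hypothetical scorer: importance of each remaining TF for the target gene."""
    return {tf: pearson_abs(values, target) for tf, values in tf_expr.items()}


def backward_elimination(tf_expr, target, score_fn, drop_fraction=0.1):
    """Rank TFs for one pathway gene by repeatedly dropping the weakest ones.

    tf_expr: dict mapping TF name -> list of expression values
    target: expression values of the pathway gene
    score_fn: callable(remaining_tf_expr, target) -> dict TF name -> importance
    Returns TF names ordered from most to least important.
    """
    remaining = dict(tf_expr)
    eliminated = []  # least important TFs are appended first
    while len(remaining) > 1:
        # Importance is recomputed on the surviving TFs in every round.
        scores = score_fn(remaining, target)
        # Assumption: drop at least one TF per round and always keep one.
        n_drop = max(1, int(len(remaining) * drop_fraction))
        n_drop = min(n_drop, len(remaining) - 1)
        weakest = sorted(remaining, key=lambda tf: scores[tf])[:n_drop]
        eliminated.extend(weakest)
        for tf in weakest:
            del remaining[tf]
    # The survivor comes first, then TFs in reverse order of elimination.
    return list(remaining) + eliminated[::-1]


# Synthetic demo data: 20 noise TFs plus one TF that tracks the pathway gene.
rng = random.Random(0)
target = [rng.gauss(0, 1) for _ in range(50)]
tf_expr = {f"TF{i}": [rng.gauss(0, 1) for _ in range(50)] for i in range(20)}
tf_expr["TF_true"] = [t + rng.gauss(0, 0.1) for t in target]

ranking = backward_elimination(tf_expr, target, correlation_scores)
print(ranking[:3])
```

With the correlation stand-in, each TF's score does not depend on which other TFs remain, so the loop gives the same order as a single sort. With a model-based scorer such as a random forest, the importance values change as weak TFs are removed, which is the point of recomputing them each round. The aggregation across pathway genes and the Gaussian mixture model fit mentioned in the description are not covered by this sketch.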