Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data
Gene-to-gene networks, such as Gene Regulatory Networks (GRN) and Predictive Expression Networks (PEN), capture relationships between genes and are beneficial for use in downstream biological analyses. There exist multiple network inference tools to produce these gene-to-gene networks from matrices...
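Main authors: | Walker, Angelica M.; Cliff, Ashley; Romero, Jonathon; Shah, Manesh B.; Jones, Piet; Felipe Machado Gazolla, Joao Gabriel; Jacobson, Daniel A; Kainer, David |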
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9260260/ https://www.ncbi.nlm.nih.gov/pubmed/35832622 http://dx.doi.org/10.1016/j.csbj.2022.06.037 |
_version_ | 1784741983693045760 |
---|---|
author | Walker, Angelica M.; Cliff, Ashley; Romero, Jonathon; Shah, Manesh B.; Jones, Piet; Felipe Machado Gazolla, Joao Gabriel; Jacobson, Daniel A; Kainer, David |
author_facet | Walker, Angelica M.; Cliff, Ashley; Romero, Jonathon; Shah, Manesh B.; Jones, Piet; Felipe Machado Gazolla, Joao Gabriel; Jacobson, Daniel A; Kainer, David |
author_sort | Walker, Angelica M. |
collection | PubMed |
description | Gene-to-gene networks, such as Gene Regulatory Networks (GRN) and Predictive Expression Networks (PEN), capture relationships between genes and are beneficial for use in downstream biological analyses. There exist multiple network inference tools to produce these gene-to-gene networks from matrices of gene expression data. Random Forest-Leave One Out Prediction (RF-LOOP), frequently known as GEne Network Inference with Ensemble of trees (GENIE3), is a method that has been shown to be efficient at producing these gene-to-gene networks. Random Forest can be replaced in this process by iterative Random Forest (iRF), which performs variable selection and boosting. Here we validate that iterative Random Forest-Leave One Out Prediction (iRF-LOOP) produces higher quality networks than GENIE3 (RF-LOOP). We use both synthetic and empirical networks from the Dialogue for Reverse Engineering Assessment and Methods (DREAM) Challenges by Sage Bionetworks, as well as two additional empirical networks created from Arabidopsis thaliana and Populus trichocarpa expression data. |
format | Online Article Text |
id | pubmed-9260260 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-92602602022-07-12 Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data Walker, Angelica M. Cliff, Ashley Romero, Jonathon Shah, Manesh B. Jones, Piet Felipe Machado Gazolla, Joao Gabriel Jacobson, Daniel A Kainer, David Comput Struct Biotechnol J Research Article Gene-to-gene networks, such as Gene Regulatory Networks (GRN) and Predictive Expression Networks (PEN), capture relationships between genes and are beneficial for use in downstream biological analyses. There exist multiple network inference tools to produce these gene-to-gene networks from matrices of gene expression data. Random Forest-Leave One Out Prediction (RF-LOOP), frequently known as GEne Network Inference with Ensemble of trees (GENIE3), is a method that has been shown to be efficient at producing these gene-to-gene networks. Random Forest can be replaced in this process by iterative Random Forest (iRF), which performs variable selection and boosting. Here we validate that iterative Random Forest-Leave One Out Prediction (iRF-LOOP) produces higher quality networks than GENIE3 (RF-LOOP). We use both synthetic and empirical networks from the Dialogue for Reverse Engineering Assessment and Methods (DREAM) Challenges by Sage Bionetworks, as well as two additional empirical networks created from Arabidopsis thaliana and Populus trichocarpa expression data. Research Network of Computational and Structural Biotechnology 2022-06-22 /pmc/articles/PMC9260260/ /pubmed/35832622 http://dx.doi.org/10.1016/j.csbj.2022.06.037 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Walker, Angelica M. Cliff, Ashley Romero, Jonathon Shah, Manesh B. Jones, Piet Felipe Machado Gazolla, Joao Gabriel Jacobson, Daniel A Kainer, David Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
title | Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
title_full | Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
title_fullStr | Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
title_full_unstemmed | Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
title_short | Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
title_sort | evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9260260/ https://www.ncbi.nlm.nih.gov/pubmed/35832622 http://dx.doi.org/10.1016/j.csbj.2022.06.037 |