Inference of Genetic Networks From Time-Series and Static Gene Expression Data: Combining a Random-Forest-Based Inference Method With Feature Selection Methods

Several researchers have focused on random-forest-based inference methods because of their excellent performance. Some of these inference methods also have a useful ability to analyze both time-series and static gene expression data. However, they are only of use in ranking all of the candidate regu...

Bibliographic Details
Main Authors: Kimura, Shuhei; Fukutomi, Ryo; Tokuhisa, Masato; Okada, Mariko
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770182/
https://www.ncbi.nlm.nih.gov/pubmed/33384716
http://dx.doi.org/10.3389/fgene.2020.595912
author Kimura, Shuhei
Fukutomi, Ryo
Tokuhisa, Masato
Okada, Mariko
collection PubMed
description Several researchers have focused on random-forest-based inference methods because of their excellent performance. Some of these inference methods also have a useful ability to analyze both time-series and static gene expression data. However, they are only of use in ranking all of the candidate regulations by assigning them confidence values. None have been capable of detecting the regulations that actually affect a gene of interest. In this study, we propose a method to remove unpromising candidate regulations by combining the random-forest-based inference method with a series of feature selection methods. In addition to detecting unpromising regulations, our proposed method uses outputs from the feature selection methods to adjust the confidence values of all of the candidate regulations that have been computed by the random-forest-based inference method. Numerical experiments showed that the combined application with the feature selection methods improved the performance of the random-forest-based inference method on 99 of the 100 trials performed on the artificial problems. However, the improvement tends to be small, since our combined method succeeded in removing only 19% of the candidate regulations at most. The combined application with the feature selection methods moreover makes the computational cost higher. While a bigger improvement at a lower computational cost would be ideal, we see no impediments to our investigation, given that our aim is to extract as much useful information as possible from a limited amount of gene expression data.
format Online
Article
Text
id pubmed-7770182
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7770182 2020-12-30 Inference of Genetic Networks From Time-Series and Static Gene Expression Data: Combining a Random-Forest-Based Inference Method With Feature Selection Methods. Kimura, Shuhei; Fukutomi, Ryo; Tokuhisa, Masato; Okada, Mariko. Front Genet. Genetics. Frontiers Media S.A. 2020-12-15. /pmc/articles/PMC7770182/ /pubmed/33384716 http://dx.doi.org/10.3389/fgene.2020.595912 Text en. Copyright © 2020 Kimura, Fukutomi, Tokuhisa and Okada. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Inference of Genetic Networks From Time-Series and Static Gene Expression Data: Combining a Random-Forest-Based Inference Method With Feature Selection Methods
topic Genetics