Do little interactions get lost in dark random forests?

BACKGROUND: Random forests have often been claimed to uncover interaction effects. However, whether and how interaction effects can be differentiated from marginal effects remains unclear. In extensive simulation studies, we investigate whether random forest variable importance measures capture or detect...

Bibliographic Details
Main authors: Wright, Marvin N., Ziegler, Andreas, König, Inke R.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4815164/
https://www.ncbi.nlm.nih.gov/pubmed/27029549
http://dx.doi.org/10.1186/s12859-016-0995-8
author Wright, Marvin N.
Ziegler, Andreas
König, Inke R.
collection PubMed
description BACKGROUND: Random forests have often been claimed to uncover interaction effects. However, whether and how interaction effects can be differentiated from marginal effects remains unclear. In extensive simulation studies, we investigate whether random forest variable importance measures capture or detect gene-gene interactions. We define capturing an interaction as the ability to identify a variable that acts through an interaction with another one, and detection as the ability to identify an interaction effect as such. RESULTS: Of the single importance measures, the Gini importance captured interaction effects in most of the simulated scenarios; however, they were masked by marginal effects in other variables. With the permutation importance, the proportion of captured interactions was lower in all cases. Pairwise importance measures performed about equally, with a slight advantage for the joint variable importance method. However, the overall fraction of detected interactions was low. In almost all scenarios, the detection fraction in a model with only marginal effects was larger than in a model with an interaction effect only. CONCLUSIONS: Random forests are generally capable of capturing gene-gene interactions, but current variable importance measures are unable to detect them as interactions. In most cases, interactions are masked by marginal effects and cannot be differentiated from them. Consequently, caution is warranted when claiming that random forests uncover interactions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0995-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4815164
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4815164 2016-04-01 Do little interactions get lost in dark random forests? Wright, Marvin N. Ziegler, Andreas König, Inke R. BMC Bioinformatics Research Article
BioMed Central 2016-03-31 /pmc/articles/PMC4815164/ /pubmed/27029549 http://dx.doi.org/10.1186/s12859-016-0995-8 Text en © Wright et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Do little interactions get lost in dark random forests?
topic Research Article