
Random Forests Based Group Importance Scores and Their Statistical Interpretation: Application for Alzheimer's Disease

Machine learning approaches have been increasingly used in the neuroimaging field for the design of computer-aided diagnosis systems. In this paper, we focus on the ability of these methods to provide interpretable information about the brain regions that are the most informative about the disease o...


Bibliographic Details
Main Authors: Wehenkel, Marie; Sutera, Antonio; Bastin, Christine; Geurts, Pierre; Phillips, Christophe
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6034092/
https://www.ncbi.nlm.nih.gov/pubmed/30008658
http://dx.doi.org/10.3389/fnins.2018.00411
author Wehenkel, Marie
Sutera, Antonio
Bastin, Christine
Geurts, Pierre
Phillips, Christophe
collection PubMed
description Machine learning approaches have been increasingly used in the neuroimaging field for the design of computer-aided diagnosis systems. In this paper, we focus on the ability of these methods to provide interpretable information about the brain regions that are the most informative about the disease or condition of interest. In particular, we investigate the benefit of group-based, instead of voxel-based, analyses in the context of Random Forests. Assuming a prior division of the voxels into non-overlapping groups (defined by an atlas), we propose several procedures to derive group importances from individual voxel importances derived from Random Forests models. We then adapt several permutation schemes to turn group importance scores into more interpretable statistical scores that make it possible to determine the truly relevant groups in the importance rankings. The good behaviour of these methods is first assessed on artificial datasets. Then, they are applied to our own dataset of FDG-PET scans to identify the brain regions involved in the prognosis of Alzheimer's disease.
format Online
Article
Text
id pubmed-6034092
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6034092 2018-07-13
Front Neurosci
Frontiers Media S.A. 2018-06-29
/pmc/articles/PMC6034092/ /pubmed/30008658 http://dx.doi.org/10.3389/fnins.2018.00411
Text en
Copyright © 2018 Wehenkel, Sutera, Bastin, Geurts and Phillips. http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Random Forests Based Group Importance Scores and Their Statistical Interpretation: Application for Alzheimer's Disease
topic Neuroscience
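
The description in this record outlines two steps: aggregating per-voxel importances into per-group importances, and using permutations to turn group scores into statistical scores. The sketch below illustrates that general idea only and is not the authors' implementation. The record does not say which aggregation rules or permutation schemes the paper uses, so the `sum`/`mean`/`max` reducers, the label-permutation test, and every function name here are assumptions. To keep the example dependency-light, a simple absolute-correlation score stands in for Random Forest voxel importances; in practice one might plug in something like scikit-learn's `feature_importances_` instead.

```python
import numpy as np


def group_importances(voxel_importances, groups, how="sum"):
    """Aggregate per-voxel importances into one score per group.

    `groups[i]` is the (hypothetical) atlas label of voxel i.
    The reducers are illustrative choices, not taken from the record.
    """
    voxel_importances = np.asarray(voxel_importances, dtype=float)
    groups = np.asarray(groups)
    reducers = {"sum": np.sum, "mean": np.mean, "max": np.max}
    reduce = reducers[how]
    return {
        g.item(): float(reduce(voxel_importances[groups == g]))
        for g in np.unique(groups)
    }


def abs_correlation(X, y):
    """Toy voxel importance: |Pearson correlation| of each column with y.

    Stand-in for Random Forest importances (an assumption for this demo).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom


def permutation_pvalues(X, y, groups, importance_fn, n_perm=200, how="sum", seed=0):
    """Empirical p-value per group from a label-permutation null distribution."""
    rng = np.random.default_rng(seed)
    observed = group_importances(importance_fn(X, y), groups, how)
    exceed = {g: 0 for g in observed}
    for _ in range(n_perm):
        y_perm = rng.permutation(y)  # break any link between X and y
        null = group_importances(importance_fn(X, y_perm), groups, how)
        for g in observed:
            exceed[g] += int(null[g] >= observed[g])
    # The +1 terms keep the estimated p-value strictly positive.
    return {g: (exceed[g] + 1) / (n_perm + 1) for g in observed}


# Synthetic demo: 6 "voxels" in 2 groups; only group 0 carries signal.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)
X = rng.normal(size=(100, 6))
X[:, :3] += 2.0 * y[:, None]
groups = np.array([0, 0, 0, 1, 1, 1])

scores = group_importances(abs_correlation(X, y), groups, how="sum")
pvals = permutation_pvalues(X, y, groups, abs_correlation, n_perm=200)
print(scores)
print(pvals)
```

Passing the importance function as a parameter keeps the permutation loop independent of the model, so a forest-based importance could be substituted without changing the test itself. Refitting a model for every permutation would be far more expensive than this toy score.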