
Predictive modelling using neuroimaging data in the presence of confounds

When training predictive models from neuroimaging data, we typically have available non-imaging variables such as age and gender that affect the imaging data but which we may be uninterested in from a clinical perspective. Such variables are commonly referred to as ‘confounds’. In this work, we firs...


Bibliographic Details
Main authors:	Rao, Anil; Monteiro, Joao M.; Mourao-Miranda, Janaina
Format:	Online Article Text
Language:	English
Published:	Academic Press 2017
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5391990/ https://www.ncbi.nlm.nih.gov/pubmed/28143776 http://dx.doi.org/10.1016/j.neuroimage.2017.01.066

author	Rao, Anil Monteiro, Joao M. Mourao-Miranda, Janaina
collection	PubMed
description	When training predictive models from neuroimaging data, we typically have available non-imaging variables such as age and gender that affect the imaging data but which we may be uninterested in from a clinical perspective. Such variables are commonly referred to as ‘confounds’. In this work, we firstly give a working definition for confound in the context of training predictive models from samples of neuroimaging data. We define a confound as a variable which affects the imaging data and has an association with the target variable in the sample that differs from that in the population-of-interest, i.e., the population over which we intend to apply the estimated predictive model. The focus of this paper is the scenario in which the confound and target variable are independent in the population-of-interest, but the training sample is biased due to a sample association between the target and confound. We then discuss standard approaches for dealing with confounds in predictive modelling such as image adjustment and including the confound as a predictor, before deriving and motivating an Instance Weighting scheme that attempts to account for confounds by focusing model training so that it is optimal for the population-of-interest. We evaluate the standard approaches and Instance Weighting in two regression problems with neuroimaging data in which we train models in the presence of confounding, and predict samples that are representative of the population-of-interest. For comparison, these models are also evaluated when there is no confounding present. In the first experiment we predict the MMSE score using structural MRI from the ADNI database with gender as the confound, while in the second we predict age using structural MRI from the IXI database with acquisition site as the confound. 
Considered over both datasets we find that none of the methods for dealing with confounding gives more accurate predictions than a baseline model which ignores confounding, although including the confound as a predictor gives models that are less accurate than the baseline model. We do find, however, that different methods appear to focus their predictions on specific subsets of the population-of-interest, and that predictive accuracy is greater when there is no confounding present. We conclude with a discussion comparing the advantages and disadvantages of each approach, and the implications of our evaluation for building predictive models that can be used in clinical practice.
format	Online Article Text
id	pubmed-5391990
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Academic Press
record_format	MEDLINE/PubMed
spelling	pubmed-5391990 2017-04-18 Predictive modelling using neuroimaging data in the presence of confounds Rao, Anil Monteiro, Joao M. Mourao-Miranda, Janaina Neuroimage Article Academic Press 2017-04-15 /pmc/articles/PMC5391990/ /pubmed/28143776 http://dx.doi.org/10.1016/j.neuroimage.2017.01.066 Text en © 2017 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	Predictive modelling using neuroimaging data in the presence of confounds
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5391990/ https://www.ncbi.nlm.nih.gov/pubmed/28143776 http://dx.doi.org/10.1016/j.neuroimage.2017.01.066
