
Testing the additional predictive value of high-dimensional molecular data

BACKGROUND: While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are a...


Bibliographic Details
Main authors: Boulesteix, Anne-Laure; Hothorn, Torsten
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2837029/
https://www.ncbi.nlm.nih.gov/pubmed/20144191
http://dx.doi.org/10.1186/1471-2105-11-78
author Boulesteix, Anne-Laure
Hothorn, Torsten
collection PubMed
description BACKGROUND: While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature. RESULTS: We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purpose, it is applied to the two publicly available cancer data sets. CONCLUSIONS: Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available. It is implemented in the R package "globalboosttest" which is publicly available from R-forge and will be sent to the CRAN as soon as possible.
format Text
id pubmed-2837029
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2837029 2010-03-12 Testing the additional predictive value of high-dimensional molecular data Boulesteix, Anne-Laure Hothorn, Torsten BMC Bioinformatics Methodology article BioMed Central 2010-02-08 /pmc/articles/PMC2837029/ /pubmed/20144191 http://dx.doi.org/10.1186/1471-2105-11-78 Text en Copyright ©2010 Boulesteix and Hothorn; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Testing the additional predictive value of high-dimensional molecular data
topic Methodology article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2837029/
https://www.ncbi.nlm.nih.gov/pubmed/20144191
http://dx.doi.org/10.1186/1471-2105-11-78