
Incorporating pathway information into boosting estimation of high-dimensional risk prediction models

BACKGROUND: There are several techniques for fitting risk prediction models to high-dimensional data, arising from microarrays. However, the biological knowledge about relations between genes is only rarely taken into account. One recent approach incorporates pathway information, available, e.g., fr...


Bibliographic Details
Main authors: Binder, Harald; Schumacher, Martin
Format: Text
Language: English
Published: BioMed Central, 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2647532/
https://www.ncbi.nlm.nih.gov/pubmed/19144132
http://dx.doi.org/10.1186/1471-2105-10-18
author Binder, Harald
Schumacher, Martin
collection PubMed
description BACKGROUND: There are several techniques for fitting risk prediction models to high-dimensional data, arising from microarrays. However, the biological knowledge about relations between genes is only rarely taken into account. One recent approach incorporates pathway information, available, e.g., from the KEGG database, by augmenting the penalty term in Lasso estimation for continuous response models. RESULTS: As an alternative, we extend componentwise likelihood-based boosting techniques for incorporating pathway information into a larger number of model classes, such as generalized linear models and the Cox proportional hazards model for time-to-event data. In contrast to Lasso-like approaches, no further assumptions for explicitly specifying the penalty structure are needed, as pathway information is incorporated by adapting the penalties for single microarray features in the course of the boosting steps. This is shown to result in improved prediction performance when the coefficients of connected genes have opposite sign. The properties of the fitted models resulting from this approach are then investigated in two application examples with microarray survival data. CONCLUSION: The proposed approach results not only in improved prediction performance but also in structurally different model fits. Incorporating pathway information in the suggested way is therefore seen to be beneficial in several ways.
format Text
id pubmed-2647532
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2647532 2009-02-25 BMC Bioinformatics Methodology Article BioMed Central 2009-01-13 Text en Copyright © 2009 Binder and Schumacher; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Incorporating pathway information into boosting estimation of high-dimensional risk prediction models
topic Methodology Article