Weighted Cox regression for the prediction of heterogeneous patient subgroups

Bibliographic Details
Main authors: Madjar, Katrin; Rahnenführer, Jörg
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8650299/
https://www.ncbi.nlm.nih.gov/pubmed/34876106
http://dx.doi.org/10.1186/s12911-021-01698-1
Description
Summary:
BACKGROUND: An important task in clinical medicine is the construction of risk prediction models for specific subgroups of patients based on high-dimensional molecular measurements such as gene expression data. Major objectives in modeling high-dimensional data are good prediction performance and feature selection to find a subset of predictors that are truly associated with a clinical outcome such as a time-to-event endpoint. In clinical practice, this task is challenging since patient cohorts are typically small and can be heterogeneous with regard to their relationship between predictors and outcome. When data of several subgroups of patients with the same or similar disease are available, it is tempting to combine them to increase sample size, such as in multicenter studies. However, heterogeneity between subgroups can lead to biased results and subgroup-specific effects may remain undetected.
METHODS: For this situation, we propose a penalized Cox regression model with a weighted version of the Cox partial likelihood that includes patients of all subgroups but assigns them individual weights based on their subgroup affiliation. The weights are estimated from the data such that patients who are likely to belong to the subgroup of interest obtain higher weights in the subgroup-specific model.
RESULTS: Our proposed approach is evaluated through simulations and application to real lung cancer cohorts, and compared to existing approaches. Simulation results demonstrate that our proposed model is superior to standard approaches in terms of prediction performance and variable selection accuracy when the sample size is small.
CONCLUSIONS: The results suggest that sharing information between subgroups by incorporating appropriate weights into the likelihood can increase power to identify the prognostic covariates and improve risk prediction.
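The weighted partial likelihood mentioned under METHODS can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the code below assumes one common form of a weighted Cox log partial likelihood, in which each patient's weight scales both their own event contribution and their term in every risk set, combined with a lasso penalty as one possible choice of penalization. The function names, the `lam` parameter, and the toy data are hypothetical, and the weights are taken as given inputs rather than estimated from the data as in the paper.

```python
import numpy as np


def weighted_cox_neg_log_partial_likelihood(beta, X, time, event, weights):
    """Negative weighted Cox log partial likelihood (assumed form, not the paper's code).

    Assumed form: sum_i w_i * d_i * (x_i'beta - log(sum_{j in R_i} w_j * exp(x_j'beta))),
    where R_i is the set of patients still at risk at patient i's event time.
    Tied event times are handled Breslow-style (each tie uses the full risk set).
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    weights = np.asarray(weights, dtype=float)

    eta = X @ beta  # linear predictor per patient

    # at_risk[i, j] is True when patient j is still at risk at patient i's time
    at_risk = time[None, :] >= time[:, None]
    risk_sums = (at_risk * (weights * np.exp(eta))[None, :]).sum(axis=1)

    # Only patients with an observed event and a positive weight contribute
    contributes = (weights * event) > 0
    log_lik = np.sum(
        weights[contributes] * (eta[contributes] - np.log(risk_sums[contributes]))
    )
    return -log_lik


def lasso_penalized_objective(beta, X, time, event, weights, lam):
    """Weighted negative log partial likelihood plus an L1 penalty (illustrative choice)."""
    nll = weighted_cox_neg_log_partial_likelihood(beta, X, time, event, weights)
    return nll + lam * np.sum(np.abs(np.asarray(beta, dtype=float)))


# Toy data: three patients, one covariate, all events observed (hypothetical values)
X_toy = np.array([[0.5], [-1.0], [2.0]])
time_toy = np.array([1.0, 2.0, 3.0])
event_toy = np.array([1, 1, 1])

# Weight 1 for patients from the subgroup of interest, smaller weight otherwise
w_toy = np.array([1.0, 0.3, 1.0])
value = lasso_penalized_objective(np.array([0.2]), X_toy, time_toy, event_toy, w_toy, lam=0.1)
```

Under this assumed form, setting a patient's weight to zero removes them from the fit entirely, and setting all weights to one recovers the standard unweighted Cox partial likelihood, so the weights interpolate between a subgroup-only model and a fully pooled model.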