Controlled variable selection in Weibull mixture cure models for high‐dimensional data

Medical breakthroughs in recent years have led to cures for many diseases. The mixture cure model (MCM) is a type of survival model that is often used when a cured fraction exists. Many have sought to identify genomic features associated with a time‐to‐event outcome which requires variable selection...
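As an illustration of the mixture cure model mentioned in the abstract, here is a minimal Python sketch of a Weibull MCM population survival function. It is not taken from the article: the Weibull parameterization `exp(-(rate * t) ** shape)`, the logistic link for the probability of being uncured, and all function names and numeric values are assumptions chosen for illustration.

```python
import math


def weibull_survival(t, shape, rate):
    """Survival of susceptible (uncured) subjects: S_u(t) = exp(-(rate * t) ** shape).

    This parameterization is an assumption; others are in common use.
    """
    return math.exp(-((rate * t) ** shape))


def uncured_probability(x, intercept, coefs):
    """Logistic model for the probability of being susceptible (assumed link)."""
    eta = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-eta))


def population_survival(t, p_uncured, shape, rate):
    """Mixture of cured and uncured subjects.

    Cured subjects never experience the event, so
    S_pop(t) = (1 - p_uncured) + p_uncured * S_u(t).
    """
    return (1.0 - p_uncured) + p_uncured * weibull_survival(t, shape, rate)


# Hypothetical covariates and coefficients, for illustration only.
p = uncured_probability([0.5, -1.0], intercept=0.2, coefs=[1.0, 0.3])
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, round(population_survival(t, p, shape=1.5, rate=0.4), 4))
```

The population survival curve starts at 1 and levels off at the cured fraction `1 - p_uncured` instead of dropping to 0, which is the plateau that motivates cure models.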

Bibliographic Details
Main Authors: Fu, Han; Nicolet, Deedra; Mrózek, Krzysztof; Stone, Richard M.; Eisfeld, Ann‐Kathrin; Byrd, John C.; Archer, Kellie J.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9545322/
https://www.ncbi.nlm.nih.gov/pubmed/35792553
http://dx.doi.org/10.1002/sim.9513
author Fu, Han
Nicolet, Deedra
Mrózek, Krzysztof
Stone, Richard M.
Eisfeld, Ann‐Kathrin
Byrd, John C.
Archer, Kellie J.
collection PubMed
description Medical breakthroughs in recent years have led to cures for many diseases. The mixture cure model (MCM) is a type of survival model that is often used when a cured fraction exists. Many have sought to identify genomic features associated with a time‐to‐event outcome which requires variable selection strategies for high‐dimensional spaces. Unfortunately, currently few variable selection methods exist for MCMs especially when there are more predictors than samples. This study develops high‐dimensional penalized Weibull MCMs, which allow for identification of prognostic factors associated with both cure status and/or survival. We demonstrated how such models may be estimated using two different iterative algorithms. The model‐X knockoffs method was combined with these algorithms to control the false discovery rate (FDR) in variable selection. Through extensive simulation studies, our penalized MCMs have been shown to outperform alternative methods on multiple metrics and achieve high statistical power with FDR being controlled. In an acute myeloid leukemia (AML) application with gene expression data, our proposed approach identified 14 genes associated with potential cure and 12 genes with time‐to‐relapse, which may help inform treatment decisions for AML patients.
format Online
Article
Text
id pubmed-9545322
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9545322 2022-10-14 Stat Med Research Articles John Wiley and Sons Inc. 2022-07-06 2022-09-30 /pmc/articles/PMC9545322/ /pubmed/35792553 http://dx.doi.org/10.1002/sim.9513 Text en © 2022 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
This is an open access article under the terms of the Creative Commons BY-NC-ND 4.0 License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title Controlled variable selection in Weibull mixture cure models for high‐dimensional data
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9545322/
https://www.ncbi.nlm.nih.gov/pubmed/35792553
http://dx.doi.org/10.1002/sim.9513
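The description above says the model‐X knockoffs method was used to control the false discovery rate in variable selection. The sketch below shows only the generic final step of a knockoff filter, the knockoff+ data-dependent threshold, under stated assumptions: each feature already has a statistic `W_j` that tends to be large and positive when the original feature matters more than its knockoff copy. How the article computes those statistics from the penalized Weibull MCM fits is not given in this record, so the values of `w`, the target level `q`, and the function names are hypothetical.

```python
def knockoff_plus_threshold(w, q):
    """Return the smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q,
    or infinity when no such t exists."""
    candidates = sorted({abs(v) for v in w if v != 0})
    for t in candidates:
        false_est = 1 + sum(1 for v in w if v <= -t)   # estimated false discoveries
        selected = max(1, sum(1 for v in w if v >= t))  # features that would be selected
        if false_est / selected <= q:
            return t
    return float("inf")


def select_features(w, q):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    tau = knockoff_plus_threshold(w, q)
    return [j for j, v in enumerate(w) if v >= tau]


# Hypothetical feature statistics, for illustration only.
w = [3.2, -0.4, 2.7, 0.1, 1.9, -0.2, 2.2, 0.0, 1.5, 2.9]
print(select_features(w, q=0.25))
```

Negative statistics act as a proxy for false positives, so the ratio inside the loop is a conservative estimate of the false discovery proportion at each candidate threshold. When the target level is too strict for the data at hand, the threshold is infinite and nothing is selected.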