SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data
Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy...
Main authors: | Zhang, Yunwei; Wong, Germaine; Mann, Graham; Muller, Samuel; Yang, Jean Y H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9338425/ https://www.ncbi.nlm.nih.gov/pubmed/35906887 http://dx.doi.org/10.1093/gigascience/giac071 |
_version_ | 1784759965441851392 |
---|---|
author | Zhang, Yunwei Wong, Germaine Mann, Graham Muller, Samuel Yang, Jean Y H |
author_facet | Zhang, Yunwei Wong, Germaine Mann, Graham Muller, Samuel Yang, Jean Y H |
author_sort | Zhang, Yunwei |
collection | PubMed |
description | Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics data sets. SurvBenchmark not only focuses on classical approaches such as the Cox model but also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics; these include model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performances of survival models vary in practice over real-world data sets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies. |
format | Online Article Text |
id | pubmed-9338425 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-93384252022-08-01 SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data Zhang, Yunwei Wong, Germaine Mann, Graham Muller, Samuel Yang, Jean Y H Gigascience Research Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics data sets. SurvBenchmark not only focuses on classical approaches such as the Cox model but also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics; these include model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performances of survival models vary in practice over real-world data sets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies. Oxford University Press 2022-07-30 /pmc/articles/PMC9338425/ /pubmed/35906887 http://dx.doi.org/10.1093/gigascience/giac071 Text en © The Author(s) 2022. 
Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Zhang, Yunwei Wong, Germaine Mann, Graham Muller, Samuel Yang, Jean Y H SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
title | SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
title_full | SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
title_fullStr | SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
title_full_unstemmed | SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
title_short | SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
title_sort | survbenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9338425/ https://www.ncbi.nlm.nih.gov/pubmed/35906887 http://dx.doi.org/10.1093/gigascience/giac071 |
work_keys_str_mv | AT zhangyunwei survbenchmarkcomprehensivebenchmarkingstudyofsurvivalanalysismethodsusingbothomicsdataandclinicaldata AT wonggermaine survbenchmarkcomprehensivebenchmarkingstudyofsurvivalanalysismethodsusingbothomicsdataandclinicaldata AT manngraham survbenchmarkcomprehensivebenchmarkingstudyofsurvivalanalysismethodsusingbothomicsdataandclinicaldata AT mullersamuel survbenchmarkcomprehensivebenchmarkingstudyofsurvivalanalysismethodsusingbothomicsdataandclinicaldata AT yangjeanyh survbenchmarkcomprehensivebenchmarkingstudyofsurvivalanalysismethodsusingbothomicsdataandclinicaldata |