
Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures

MOTIVATION: In this article, we consider how to evaluate survival distribution predictions with measures of discrimination. This is non-trivial as discrimination measures are the most commonly used in survival analysis and yet there is no clear method to derive a risk prediction from a distribution...


Bibliographic Details
Main Authors:	Sonabend, Raphael; Bender, Andreas; Vollmer, Sebastian
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2022
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438958/ https://www.ncbi.nlm.nih.gov/pubmed/35818973 http://dx.doi.org/10.1093/bioinformatics/btac451

_version_	1784781942576644096
author	Sonabend, Raphael; Bender, Andreas; Vollmer, Sebastian
author_facet	Sonabend, Raphael; Bender, Andreas; Vollmer, Sebastian
author_sort	Sonabend, Raphael
collection	PubMed
description	MOTIVATION: In this article, we consider how to evaluate survival distribution predictions with measures of discrimination. This is non-trivial as discrimination measures are the most commonly used in survival analysis and yet there is no clear method to derive a risk prediction from a distribution prediction. We survey methods proposed in literature and software and consider their respective advantages and disadvantages. RESULTS: Whilst distributions are frequently evaluated by discrimination measures, we find that the method for doing so is rarely described in the literature and often leads to unfair comparisons or ‘C-hacking’. We demonstrate by example how simple it can be to manipulate results and use this to argue for better reporting guidelines and transparency in the literature. We recommend that machine learning survival analysis software implements clear transformations between distribution and risk predictions in order to allow more transparent and accessible model evaluation. AVAILABILITY AND IMPLEMENTATION: The code used in the final experiment is available at https://github.com/RaphaelS1/distribution_discrimination.
format	Online Article Text
id	pubmed-9438958
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-9438958 2022-09-06 Bioinformatics Original Papers Oxford University Press 2022-07-12 /pmc/articles/PMC9438958/ /pubmed/35818973 http://dx.doi.org/10.1093/bioinformatics/btac451 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Avoiding C-hacking when evaluating survival distribution predictions with discrimination measures
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438958/ https://www.ncbi.nlm.nih.gov/pubmed/35818973 http://dx.doi.org/10.1093/bioinformatics/btac451
