
Interval Coded Scoring: a toolbox for interpretable scoring systems


Bibliographic Details
Main Authors: Billiet, Lieven; Van Huffel, Sabine; Van Belle, Vanya
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Data Mining and Machine Learning
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924521/
https://www.ncbi.nlm.nih.gov/pubmed/33816804
http://dx.doi.org/10.7717/peerj-cs.150
author Billiet, Lieven
Van Huffel, Sabine
Van Belle, Vanya
collection PubMed
description Over the past few decades, clinical decision support systems have been gaining importance. They help clinicians to make effective use of the overload of available information to obtain correct diagnoses and appropriate treatments. However, their power often comes at the cost of a black box model which cannot be interpreted easily. This interpretability is of paramount importance in a medical setting with regard to trust and (legal) responsibility. In contrast, existing medical scoring systems are easy to understand and use, but they are often a simplified rule-of-thumb summary of previous medical experience rather than a well-founded system based on available data. Interval Coded Scoring (ICS) connects these two approaches, exploiting the power of sparse optimization to derive scoring systems from training data. The presented toolbox interface makes this theory easily applicable to both small and large datasets. It contains two possible problem formulations based on linear programming or elastic net. Both allow the user to construct a model for a binary classification problem and establish risk profiles that can be used for future diagnosis. All of this requires only a few lines of code. ICS differs from standard machine learning through its model consisting of interpretable main effects and interactions. Furthermore, insertion of expert knowledge is possible because the training can be semi-automatic. This allows end users to make a trade-off between complexity and performance based on cross-validation results and expert knowledge. Additionally, the toolbox offers an accessible way to assess classification performance via accuracy and the ROC curve, whereas the calibration of the risk profile can be evaluated via a calibration curve. Finally, the colour-coded model visualization has particular appeal if one wants to apply ICS manually on new observations, as well as for validation by experts in the specific application domains.
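The general idea behind such scoring systems can be illustrated with a small, self-contained sketch: bin a continuous feature into intervals, fit an elastic-net-penalized logistic model on the interval indicators, and round the resulting weights to small integer points. This is not the ICS toolbox interface; the bin edges, penalty weights, and the rounding to a maximum of 5 points below are illustrative assumptions only.

```python
import numpy as np

def bin_feature(x, edges):
    """One-hot encode a continuous feature into intervals given bin edges."""
    idx = np.digitize(x, edges)                 # interval index per sample
    onehot = np.zeros((x.size, len(edges) + 1))
    onehot[np.arange(x.size), idx] = 1.0
    return onehot

def fit_elastic_net_logistic(X, y, lam=0.1, alpha=0.5, lr=0.5, steps=2000):
    """Gradient descent on logistic loss plus the elastic-net penalty
    lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad_w = X.T @ (p - y) / n + lam * (alpha * np.sign(w) + (1 - alpha) * w)
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 400)                      # a single synthetic measurement
y = (x > 6).astype(float)                        # risk increases above a threshold
X = bin_feature(x, edges=[2.5, 5.0, 7.5])        # four intervals

w, b = fit_elastic_net_logistic(X, y)
# round each interval's weight to an integer point score (scale is arbitrary)
points = np.round(w / np.abs(w).max() * 5).astype(int)
print(points)
```

A new observation is then scored by summing the points of the intervals it falls into, which is the kind of manual application the colour-coded visualization supports.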
The validity and applicability of the toolbox are demonstrated by comparing it to standard machine learning approaches such as Naive Bayes and Support Vector Machines on several real-life datasets. These case studies on medical problems show its applicability as a decision support system. ICS performs similarly in terms of classification and calibration. Its slightly lower performance is countered by its model simplicity, which makes it the method of choice when interpretability is a key issue.
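The calibration assessment mentioned in the description can be sketched as a reliability check: group predicted risks into bins and compare the mean predicted probability with the observed event rate in each bin. The `calibration_curve` helper and the synthetic data below are hypothetical illustrations, not part of the ICS toolbox.

```python
import numpy as np

def calibration_curve(p_pred, y_true, n_bins=5):
    """Mean predicted risk vs. observed event rate per probability bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(p_pred, edges) - 1, 0, n_bins - 1)
    mean_pred, frac_pos = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            mean_pred.append(p_pred[mask].mean())
            frac_pos.append(y_true[mask].mean())
    return np.array(mean_pred), np.array(frac_pos)

rng = np.random.default_rng(1)
p = rng.uniform(0, 1, 5000)                       # predicted risks
y = (rng.uniform(0, 1, 5000) < p).astype(float)   # outcomes drawn at those risks
mp, fp = calibration_curve(p, y)
# a well-calibrated model stays close to the diagonal: mean_pred ~ frac_pos
print(np.max(np.abs(mp - fp)))
```

Plotting `mp` against `fp` gives the calibration curve; systematic departures from the diagonal indicate over- or under-confident risk estimates.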
format Online
Article
Text
id pubmed-7924521
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7924521 2021-04-02 Interval Coded Scoring: a toolbox for interpretable scoring systems
Billiet, Lieven; Van Huffel, Sabine; Van Belle, Vanya
PeerJ Comput Sci, Data Mining and Machine Learning
PeerJ Inc. 2018-04-02 /pmc/articles/PMC7924521/ /pubmed/33816804 http://dx.doi.org/10.7717/peerj-cs.150 Text en
©2018 Billiet et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Interval Coded Scoring: a toolbox for interpretable scoring systems
topic Data Mining and Machine Learning