
A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support

BACKGROUND: Probabilistic assessments of clinical care are essential for quality care. Yet machine learning, which supports this care process, has been limited to categorical results. To maximize its usefulness, it is important to find novel approaches that calibrate the ML output with a likelihood...


Bibliographic Details
Main authors:	Connolly, Brian; Cohen, K. Bretonnel; Santel, Daniel; Bayram, Ulya; Pestian, John
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2017
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5545857/ https://www.ncbi.nlm.nih.gov/pubmed/28784111 http://dx.doi.org/10.1186/s12859-017-1736-3

_version_	1783255497538797568
author	Connolly, Brian; Cohen, K. Bretonnel; Santel, Daniel; Bayram, Ulya; Pestian, John
author_facet	Connolly, Brian; Cohen, K. Bretonnel; Santel, Daniel; Bayram, Ulya; Pestian, John
author_sort	Connolly, Brian
collection	PubMed
description	BACKGROUND: Probabilistic assessments of clinical care are essential for quality care. Yet machine learning, which supports this care process, has been limited to categorical results. To maximize its usefulness, it is important to find novel approaches that calibrate the ML output with a likelihood scale. Current state-of-the-art calibration methods are generally accurate and applicable to many ML models, but improved granularity and accuracy of such methods would increase the information available for clinical decision making. This novel non-parametric Bayesian approach is demonstrated on a variety of data sets, including simulated classifier outputs, biomedical data sets from the University of California, Irvine (UCI) Machine Learning Repository, and a clinical data set built to determine suicide risk from the language of emergency department patients. RESULTS: The method is first demonstrated on support-vector machine (SVM) models, which generally produce well-behaved, well-understood scores. The method produces calibrations that are comparable to the state-of-the-art Bayesian Binning in Quantiles (BBQ) method when the SVM models are able to effectively separate cases and controls. However, as the SVM models’ ability to discriminate classes decreases, our approach yields more granular and dynamic calibrated probabilities compared with the BBQ method. Improvements in granularity and range are even more dramatic when the discrimination between the classes is artificially degraded by replacing the SVM model with an ad hoc k-means classifier. CONCLUSIONS: The method allows both clinicians and patients to have a more nuanced view of the output of an ML model, allowing better decision making. The method is demonstrated on simulated data, various biomedical data sets and a clinical data set, to which diverse ML methods are applied. Trivially extending the method to (non-ML) clinical scores is also discussed.
format	Online Article Text
id	pubmed-5545857
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5545857 2017-08-09 A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support. Connolly, Brian; Cohen, K. Bretonnel; Santel, Daniel; Bayram, Ulya; Pestian, John. BMC Bioinformatics. Methodology Article. BioMed Central 2017-08-07 /pmc/articles/PMC5545857/ /pubmed/28784111 http://dx.doi.org/10.1186/s12859-017-1736-3 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5545857/ https://www.ncbi.nlm.nih.gov/pubmed/28784111 http://dx.doi.org/10.1186/s12859-017-1736-3
