The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling

In many data classification problems, a number of methods will give similar accuracy. However, when working with people who are not experts in data science, such as doctors, lawyers, and judges, finding interpretable algorithms can be a critical success factor. Practitioners have a deep...

Bibliographic Details
Main Authors:	Sheth, Mallory; Gerovitch, Albert; Welsch, Roy; Markuzon, Natasha
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2019
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6788700/ https://www.ncbi.nlm.nih.gov/pubmed/31603902 http://dx.doi.org/10.1371/journal.pone.0223161

author	Sheth, Mallory; Gerovitch, Albert; Welsch, Roy; Markuzon, Natasha
collection	PubMed
description	In many data classification problems, a number of methods will give similar accuracy. However, when working with people who are not experts in data science, such as doctors, lawyers, and judges, finding interpretable algorithms can be a critical success factor. Practitioners have a deep understanding of the individual input variables but far less insight into how they interact with each other. For example, there may be ranges of an input variable for which the observed outcome is significantly more or less likely. This paper describes an algorithm for automatic detection of such thresholds, called the Univariate Flagging Algorithm (UFA). The algorithm searches for a separation that optimizes the difference between separated areas while obtaining a high level of support. We evaluate its performance using six sample datasets and demonstrate that thresholds identified by the algorithm align well with published results and known physiological boundaries. We also introduce two classification approaches that use UFA and show that the performance attained on unseen test data is comparable to or better than that of traditional classifiers when confidence intervals are considered. We identify conditions under which UFA performs well, including applications with large amounts of missing or noisy data, applications with a large number of inputs relative to observations, and applications where incidence of the target is low. We argue that ease of explanation of the results, robustness to missing data and noise, and detection of low-incidence adverse outcomes are desirable features for clinical applications that can be achieved with a relatively simple classifier, like UFA.
format	Online Article Text
id	pubmed-6788700
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-6788700 2019-10-25 PLoS One Research Article Public Library of Science 2019-10-11 /pmc/articles/PMC6788700/ /pubmed/31603902 http://dx.doi.org/10.1371/journal.pone.0223161 Text en © 2019 Sheth et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6788700/ https://www.ncbi.nlm.nih.gov/pubmed/31603902 http://dx.doi.org/10.1371/journal.pone.0223161
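
The description field says the algorithm "searches for a separation that optimizes the difference between separated areas while obtaining a high level of support", but this record does not give the objective function. The Python sketch below is therefore only an illustration of a univariate threshold search under stated assumptions, not the authors' method: the function name `find_flag_threshold`, the `min_support` parameter, the use of the difference in outcome rates as the score, and the choice to skip missing (NaN) inputs are all hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def find_flag_threshold(
    values: Sequence[float],
    outcomes: Sequence[int],
    min_support: int = 5,
) -> Optional[Tuple[float, str, float]]:
    """Return (threshold, side, rate_difference), or None if no split qualifies.

    side is "high" (flag values >= threshold) or "low" (flag values <= threshold).
    The score is an assumed stand-in for the paper's criterion: the outcome
    rate inside the flagged range minus the rate outside it.
    """
    # Assumption: observations with a missing (NaN) input are simply skipped.
    pairs = [(v, y) for v, y in zip(values, outcomes) if not math.isnan(v)]
    n = len(pairs)
    total_pos = sum(y for _, y in pairs)
    best: Optional[Tuple[float, str, float]] = None

    # Every distinct observed value is a candidate threshold.
    for t in sorted({v for v, _ in pairs}):
        for side in ("high", "low"):
            flagged = [
                y for v, y in pairs if (v >= t if side == "high" else v <= t)
            ]
            rest = n - len(flagged)
            # Assumed support rule: enough observations on both sides.
            if len(flagged) < min_support or rest < min_support:
                continue
            rate_in = sum(flagged) / len(flagged)
            rate_out = (total_pos - sum(flagged)) / rest
            diff = rate_in - rate_out
            # Keep the split with the largest absolute rate difference.
            if best is None or abs(diff) > abs(best[2]):
                best = (t, side, diff)
    return best


if __name__ == "__main__":
    # Toy data (made up for illustration): outcome occurs only for larger values.
    xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(find_flag_threshold(xs, ys, min_support=3))
```

Scanning every observed value keeps the sketch easy to follow at the cost of quadratic time; a sorted cumulative-sum pass would give the same result faster on large inputs.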

Similar Items