Interpretability With Accurate Small Models

Models often need to be constrained to a certain size for them to be considered interpretable. For example, a decision tree of depth 5 is much easier to understand than one of depth 50. Limiting model size, however, often reduces accuracy. We suggest a practical technique that minimizes this trade-o...

Bibliographic Details
Main authors:	Ghose, Abhishek; Ravindran, Balaraman
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Artificial Intelligence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861231/ https://www.ncbi.nlm.nih.gov/pubmed/33733123 http://dx.doi.org/10.3389/frai.2020.00003

_version_	1783647040485457920
author	Ghose, Abhishek Ravindran, Balaraman
author_facet	Ghose, Abhishek Ravindran, Balaraman
author_sort	Ghose, Abhishek
collection	PubMed
description	Models often need to be constrained to a certain size for them to be considered interpretable. For example, a decision tree of depth 5 is much easier to understand than one of depth 50. Limiting model size, however, often reduces accuracy. We suggest a practical technique that minimizes this trade-off between interpretability and classification accuracy. This enables an arbitrary learning algorithm to produce highly accurate small-sized models. Our technique identifies the training data distribution to learn from that leads to the highest accuracy for a model of a given size. We represent the training distribution as a combination of sampling schemes. Each scheme is defined by a parameterized probability mass function applied to the segmentation produced by a decision tree. An Infinite Mixture Model with Beta components is used to represent a combination of such schemes. The mixture model parameters are learned using Bayesian Optimization. Under simplistic assumptions, we would need to optimize for O(d) variables for a distribution over a d-dimensional input space, which is cumbersome for most real-world data. However, we show that our technique significantly reduces this number to a fixed set of eight variables at the cost of relatively cheap preprocessing. The proposed technique is flexible: it is model-agnostic, i.e., it may be applied to the learning algorithm for any model family, and it admits a general notion of model size. We demonstrate its effectiveness using multiple real-world datasets to construct decision trees, linear probability models and gradient boosted models with different sizes. We observe significant improvements in the F1-score in most instances, exceeding an improvement of 100% in some cases.
format	Online Article Text
id	pubmed-7861231
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7861231 2021-03-16 Interpretability With Accurate Small Models Ghose, Abhishek Ravindran, Balaraman Front Artif Intell Artificial Intelligence Frontiers Media S.A. 
2020-02-25 /pmc/articles/PMC7861231/ /pubmed/33733123 http://dx.doi.org/10.3389/frai.2020.00003 Text en Copyright © 2020 Ghose and Ravindran. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Artificial Intelligence Ghose, Abhishek Ravindran, Balaraman Interpretability With Accurate Small Models
title	Interpretability With Accurate Small Models
title_full	Interpretability With Accurate Small Models
title_fullStr	Interpretability With Accurate Small Models
title_full_unstemmed	Interpretability With Accurate Small Models
title_short	Interpretability With Accurate Small Models
title_sort	interpretability with accurate small models
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861231/ https://www.ncbi.nlm.nih.gov/pubmed/33733123 http://dx.doi.org/10.3389/frai.2020.00003
work_keys_str_mv	AT ghoseabhishek interpretabilitywithaccuratesmallmodels AT ravindranbalaraman interpretabilitywithaccuratesmallmodels
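
The description field above states that each sampling scheme is a parameterized probability mass function over the segments produced by a decision tree, and that schemes are combined through a mixture with Beta components. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a fixed, finite mixture with hand-picked parameters (the record describes an Infinite Mixture Model whose parameters are learned by Bayesian Optimization), and it assumes segments are ordered and mapped to evenly spaced midpoints in (0, 1), which the record does not specify. All function names are hypothetical.

```python
import math
import random


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for x strictly inside (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def segment_pmf(n_segments, components):
    """Turn a mixture of Beta components into a pmf over ordered segments.

    components: list of (weight, a, b) tuples (assumed finite here).
    Segment i is represented by its midpoint (i + 0.5) / n_segments,
    which is an assumption made for this illustration.
    """
    midpoints = [(i + 0.5) / n_segments for i in range(n_segments)]
    raw = [sum(w * beta_pdf(x, a, b) for w, a, b in components) for x in midpoints]
    total = sum(raw)
    return [r / total for r in raw]


def sample_training_indices(segment_of_point, pmf, n_samples, rng):
    """Pick a segment according to pmf, then a point uniformly inside it."""
    members = {}
    for idx, seg in enumerate(segment_of_point):
        members.setdefault(seg, []).append(idx)
    segs = sorted(members)  # only non-empty segments can be drawn
    weights = [pmf[s] for s in segs]
    chosen = rng.choices(segs, weights=weights, k=n_samples)
    return [rng.choice(members[s]) for s in chosen]


# Toy usage: 20 training points spread over 4 segments (e.g., tree leaves).
rng = random.Random(0)
segment_of_point = [i % 4 for i in range(20)]
pmf = segment_pmf(4, [(0.7, 2.0, 5.0), (0.3, 5.0, 1.5)])
sample = sample_training_indices(segment_of_point, pmf, 10, rng)
```

In a full pipeline, the indices in `sample` would select the training subset for the size-constrained model, and an outer optimizer would adjust the component parameters to maximize validation accuracy; that outer loop is omitted here.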
