Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel

Machine learning (ML) algorithms are extensively used in pharmaceutical research. Most ML models have black-box character, thus preventing the interpretation of predictions. However, rationalizing model decisions is of critical importance if predictions should aid in experimental design. Accordingly...

Bibliographic Details
Main authors:	Mastropietro, Andrea, Feldmann, Christian, Bajorath, Jürgen
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638308/ https://www.ncbi.nlm.nih.gov/pubmed/37949930 http://dx.doi.org/10.1038/s41598-023-46930-2

_version_	1785133572489740288
author	Mastropietro, Andrea Feldmann, Christian Bajorath, Jürgen
author_facet	Mastropietro, Andrea Feldmann, Christian Bajorath, Jürgen
author_sort	Mastropietro, Andrea
collection	PubMed
description	Machine learning (ML) algorithms are extensively used in pharmaceutical research. Most ML models have black-box character, thus preventing the interpretation of predictions. However, rationalizing model decisions is of critical importance if predictions should aid in experimental design. Accordingly, in interdisciplinary research, there is growing interest in explaining ML models. Methods devised for this purpose are a part of the explainable artificial intelligence (XAI) spectrum of approaches. In XAI, the Shapley value concept originating from cooperative game theory has become popular for identifying features determining predictions. The Shapley value concept has been adapted as a model-agnostic approach for explaining predictions. Since the computational time required for Shapley value calculations scales exponentially with the number of features used, local approximations such as Shapley additive explanations (SHAP) are usually required in ML. The support vector machine (SVM) algorithm is one of the most popular ML methods in pharmaceutical research and beyond. SVM models are often explained using SHAP. However, there is only limited correlation between SHAP and exact Shapley values, as previously demonstrated for SVM calculations using the Tanimoto kernel, which limits SVM model explanation. Since the Tanimoto kernel is a special kernel function mostly applied for assessing chemical similarity, we have developed the Shapley value-expressed radial basis function (SVERAD), a computationally efficient approach for the calculation of exact Shapley values for SVM models based upon radial basis function kernels that are widely applied in different areas. SVERAD is shown to produce meaningful explanations of SVM predictions.
format	Online Article Text
id	pubmed-10638308
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-10638308 2023-11-11 Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel Mastropietro, Andrea Feldmann, Christian Bajorath, Jürgen Sci Rep Article Machine learning (ML) algorithms are extensively used in pharmaceutical research. Most ML models have black-box character, thus preventing the interpretation of predictions. However, rationalizing model decisions is of critical importance if predictions should aid in experimental design. Accordingly, in interdisciplinary research, there is growing interest in explaining ML models. Methods devised for this purpose are a part of the explainable artificial intelligence (XAI) spectrum of approaches. In XAI, the Shapley value concept originating from cooperative game theory has become popular for identifying features determining predictions. The Shapley value concept has been adapted as a model-agnostic approach for explaining predictions. Since the computational time required for Shapley value calculations scales exponentially with the number of features used, local approximations such as Shapley additive explanations (SHAP) are usually required in ML. The support vector machine (SVM) algorithm is one of the most popular ML methods in pharmaceutical research and beyond. SVM models are often explained using SHAP. However, there is only limited correlation between SHAP and exact Shapley values, as previously demonstrated for SVM calculations using the Tanimoto kernel, which limits SVM model explanation. Since the Tanimoto kernel is a special kernel function mostly applied for assessing chemical similarity, we have developed the Shapley value-expressed radial basis function (SVERAD), a computationally efficient approach for the calculation of exact Shapley values for SVM models based upon radial basis function kernels that are widely applied in different areas. SVERAD is shown to produce meaningful explanations of SVM predictions.
Nature Publishing Group UK 2023-11-10 /pmc/articles/PMC10638308/ /pubmed/37949930 http://dx.doi.org/10.1038/s41598-023-46930-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	Article Mastropietro, Andrea Feldmann, Christian Bajorath, Jürgen Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
title	Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
title_full	Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
title_fullStr	Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
title_full_unstemmed	Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
title_short	Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
title_sort	calculation of exact shapley values for explaining support vector machine models using the radial basis function kernel
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638308/ https://www.ncbi.nlm.nih.gov/pubmed/37949930 http://dx.doi.org/10.1038/s41598-023-46930-2
work_keys_str_mv	AT mastropietroandrea calculationofexactshapleyvaluesforexplainingsupportvectormachinemodelsusingtheradialbasisfunctionkernel AT feldmannchristian calculationofexactshapleyvaluesforexplainingsupportvectormachinemodelsusingtheradialbasisfunctionkernel AT bajorathjurgen calculationofexactshapleyvaluesforexplainingsupportvectormachinemodelsusingtheradialbasisfunctionkernel
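
The description field above states that the time needed for Shapley value calculations scales exponentially with the number of features. The sketch below illustrates why, using plain brute-force enumeration of feature coalitions. It is not SVERAD and does not reproduce the authors' method. The function names `exact_shapley` and `rbf_value`, the choice to evaluate the RBF kernel only on the features in a coalition, the toy vectors, and the `gamma` value are all assumptions made for illustration.

```python
from itertools import combinations
from math import exp, factorial


def exact_shapley(value, n_features):
    """Brute-force Shapley values for a coalition value function.

    Each feature requires a pass over all 2**(n_features - 1) coalitions
    of the remaining features, hence the exponential cost.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def rbf_value(x, y, gamma):
    """Hypothetical value function: RBF kernel restricted to a coalition.

    Assumption: features outside the coalition are ignored, so the empty
    coalition has value exp(0) = 1.
    """
    def value(coalition):
        sq_dist = sum((x[j] - y[j]) ** 2 for j in coalition)
        return exp(-gamma * sq_dist)
    return value


# Toy binary vectors (illustrative only); they differ in features 1 and 2.
x = [1, 0, 1, 1]
y = [1, 1, 0, 1]
v = rbf_value(x, y, gamma=0.5)
phi = exact_shapley(v, len(x))
print(phi)
```

Under these assumptions, features on which the two vectors agree receive a Shapley value of zero, and the values sum to the kernel value for all features minus the value of the empty coalition. With a few dozen features the nested loops already become impractical, which is the cost that approximations such as SHAP, or closed-form approaches for specific kernels, are meant to avoid.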
