BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems
Probabilistic error/loss performance evaluation instruments that are originally used for regression and time series forecasting are also applied in some binary-class or multi-class classifiers, such as artificial neural networks. This study aims to systematically assess probabilistic instruments for...
Main Author: | Canbek, Gürol
---|---
Format: | Online Article Text
Language: | English
Published: | Springer Berlin Heidelberg, 2023
Subjects: | Original Article
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10113998/ https://www.ncbi.nlm.nih.gov/pubmed/37360884 http://dx.doi.org/10.1007/s13042-023-01826-5
_version_ | 1785027938773630976 |
---|---
author | Canbek, Gürol |
author_facet | Canbek, Gürol |
author_sort | Canbek, Gürol |
collection | PubMed |
description | Probabilistic error/loss performance evaluation instruments that are originally used for regression and time series forecasting are also applied in some binary-class or multi-class classifiers, such as artificial neural networks. This study systematically assesses probabilistic instruments for binary classification performance evaluation using a proposed two-stage benchmarking method called BenchMetrics Prob. The method employs five criteria and fourteen simulation cases based on hypothetical classifiers on synthetic datasets. The goal is to reveal specific weaknesses of performance instruments and to identify the most robust instrument for binary classification problems. The BenchMetrics Prob method was tested on 31 instruments and instrument variants, and the results identified four instruments as the most robust in a binary classification context: Sum Squared Error (SSE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE, a variant of MSE), and Mean Absolute Error (MAE). Because SSE has lower interpretability due to its [0, ∞) range, MAE, with its [0, 1] range, is the most convenient and robust probabilistic metric for generic purposes. In classification problems where large errors matter more than small errors, RMSE may be a better choice. Additionally, the results showed that instrument variants with summarization functions other than the mean (e.g., median and geometric mean), LogLoss, and error instruments of the relative, percentage, and symmetric-percentage subtypes designed for regression, such as Mean Absolute Percentage Error (MAPE), Symmetric MAPE (sMAPE), and Mean Relative Absolute Error (MRAE), were less robust and should be avoided. These findings suggest that researchers should employ robust probabilistic metrics when measuring and reporting performance in binary classification problems. |
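For reference, the four instruments the abstract identifies as robust have standard textbook definitions over predicted probabilities and binary (0/1) labels. The following minimal Python sketch computes them using those common definitions; it is illustrative only, not taken from the article, and the function name and example data are hypothetical.

```python
import math

def error_metrics(y_true, p_pred):
    """SSE, MSE, RMSE, and MAE between binary labels (0/1) and
    predicted probabilities in [0, 1], per their standard definitions."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, p_pred)]
    sse = sum(e * e for e in errors)       # Sum Squared Error: grows with n, range [0, inf)
    mse = sse / n                          # Mean Squared Error: in [0, 1] for probabilistic targets
    rmse = math.sqrt(mse)                  # Root Mean Squared Error: a variant of MSE
    mae = sum(abs(e) for e in errors) / n  # Mean Absolute Error: in [0, 1]
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "MAE": mae}

# Hypothetical example: three positive and two negative instances
labels = [1, 1, 1, 0, 0]
probs = [0.9, 0.7, 0.4, 0.2, 0.6]
print(error_metrics(labels, probs))
# -> approximately {'SSE': 0.86, 'MSE': 0.172, 'RMSE': 0.415, 'MAE': 0.36}
```

A well-known limitation also illustrates why relative/percentage instruments such as MAPE translate poorly to this setting: MAPE divides each error by the true value, which is zero for every negative instance in a binary problem, making the metric undefined or unstable there.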
format | Online Article Text |
id | pubmed-10113998 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-101139982023-04-20 BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems Canbek, Gürol Int J Mach Learn Cybern Original Article Springer Berlin Heidelberg 2023-04-19 /pmc/articles/PMC10113998/ /pubmed/37360884 http://dx.doi.org/10.1007/s13042-023-01826-5 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Article Canbek, Gürol BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_full | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_fullStr | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_full_unstemmed | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_short | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_sort | benchmetrics prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10113998/ https://www.ncbi.nlm.nih.gov/pubmed/37360884 http://dx.doi.org/10.1007/s13042-023-01826-5 |
work_keys_str_mv | AT canbekgurol benchmetricsprobbenchmarkingofprobabilisticerrorlossperformanceevaluationinstrumentsforbinaryclassificationproblems |