BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems
Probabilistic error/loss performance evaluation instruments that are originally used for regression and time series forecasting are also applied in some binary-class or multi-class classifiers, such as artificial neural networks. This study aims to systematically assess probabilistic instruments for...
Main Author: | Canbek, Gürol
---|---
Format: | Online Article Text
Language: | English
Published: | Springer Berlin Heidelberg, 2023
Subjects: | Original Article
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10113998/ https://www.ncbi.nlm.nih.gov/pubmed/37360884 http://dx.doi.org/10.1007/s13042-023-01826-5
_version_ | 1785027938773630976 |
---|---
author | Canbek, Gürol |
author_facet | Canbek, Gürol |
author_sort | Canbek, Gürol |
collection | PubMed |
description | Probabilistic error/loss performance evaluation instruments that are originally used for regression and time series forecasting are also applied in some binary-class or multi-class classifiers, such as artificial neural networks. This study systematically assesses probabilistic instruments for binary classification performance evaluation using a proposed two-stage benchmarking method called BenchMetrics Prob. The method employs five criteria and fourteen simulation cases based on hypothetical classifiers on synthetic datasets. The goal is to reveal specific weaknesses of performance instruments and to identify the most robust instrument for binary classification problems. The BenchMetrics Prob method was tested on 31 instruments and instrument variants, and the results identified four instruments as the most robust in a binary classification context: Sum Squared Error (SSE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE, a variant of MSE), and Mean Absolute Error (MAE). Because SSE has lower interpretability due to its [0, ∞) range, MAE, with its [0, 1] range, is the most convenient and robust probabilistic metric for generic purposes. In classification problems where large errors matter more than small errors, RMSE may be a better choice. Additionally, the results showed that instrument variants with summarization functions other than the mean (e.g., median and geometric mean), LogLoss, and error instruments of the relative, percentage, and symmetric-percentage subtypes designed for regression, such as Mean Absolute Percentage Error (MAPE), Symmetric MAPE (sMAPE), and Mean Relative Absolute Error (MRAE), were less robust and should be avoided. These findings suggest that researchers should employ robust probabilistic metrics when measuring and reporting performance in binary classification problems. |
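For reference, the four instruments the abstract identifies as robust have standard textbook definitions over predicted probabilities and binary (0/1) labels. The following minimal Python sketch computes them using those common definitions; it is illustrative only, not taken from the article, and the function name and example data are hypothetical.

```python
import math

def error_metrics(y_true, p_pred):
    """SSE, MSE, RMSE, and MAE between binary labels (0/1) and
    predicted probabilities in [0, 1], per their standard definitions."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, p_pred)]
    sse = sum(e * e for e in errors)       # Sum Squared Error: grows with n, range [0, inf)
    mse = sse / n                          # Mean Squared Error: in [0, 1] for probabilistic targets
    rmse = math.sqrt(mse)                  # Root Mean Squared Error: a variant of MSE
    mae = sum(abs(e) for e in errors) / n  # Mean Absolute Error: in [0, 1]
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "MAE": mae}

# Hypothetical example: three positive and two negative instances
labels = [1, 1, 1, 0, 0]
probs = [0.9, 0.7, 0.4, 0.2, 0.6]
print(error_metrics(labels, probs))
# -> approximately {'SSE': 0.86, 'MSE': 0.172, 'RMSE': 0.415, 'MAE': 0.36}
```

A well-known limitation also illustrates why relative/percentage instruments such as MAPE translate poorly to this setting: MAPE divides each error by the true value, which is zero for every negative instance in a binary problem, making the metric undefined or unstable there.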
format | Online Article Text |
id | pubmed-10113998 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-101139982023-04-20 BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems Canbek, Gürol Int J Mach Learn Cybern Original Article Springer Berlin Heidelberg 2023-04-19 /pmc/articles/PMC10113998/ /pubmed/37360884 http://dx.doi.org/10.1007/s13042-023-01826-5 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Article Canbek, Gürol BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_full | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_fullStr | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_full_unstemmed | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_short | BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
title_sort | benchmetrics prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10113998/ https://www.ncbi.nlm.nih.gov/pubmed/37360884 http://dx.doi.org/10.1007/s13042-023-01826-5 |
work_keys_str_mv | AT canbekgurol benchmetricsprobbenchmarkingofprobabilisticerrorlossperformanceevaluationinstrumentsforbinaryclassificationproblems |