Benchmarking Analysis of the Accuracy of Classification Methods Related to Entropy
In the machine learning literature we can find numerous methods to solve classification problems. We propose two new performance measures to analyze such methods. These measures are defined by using the concept of proportional reduction of classification error with respect to three benchmark classif...
Main authors: | Orenes, Yolanda; Rabasa, Alejandro; Rodriguez-Sala, Jesus Javier; Sanchez-Soriano, Joaquin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8306704/ https://www.ncbi.nlm.nih.gov/pubmed/34356391 http://dx.doi.org/10.3390/e23070850 |
_version_ | 1783727874755264512 |
---|---|
author | Orenes, Yolanda Rabasa, Alejandro Rodriguez-Sala, Jesus Javier Sanchez-Soriano, Joaquin |
author_facet | Orenes, Yolanda Rabasa, Alejandro Rodriguez-Sala, Jesus Javier Sanchez-Soriano, Joaquin |
author_sort | Orenes, Yolanda |
collection | PubMed |
description | In the machine learning literature we can find numerous methods to solve classification problems. We propose two new performance measures to analyze such methods. These measures are defined by using the concept of proportional reduction of classification error with respect to three benchmark classifiers, the random and two intuitive classifiers which are based on how a non-expert person could realize classification simply by applying a frequentist approach. We show that these three simple methods are closely related to different aspects of the entropy of the dataset. Therefore, these measures account somewhat for entropy in the dataset when evaluating the performance of classifiers. This allows us to measure the improvement in the classification results compared to simple methods, and at the same time how entropy affects classification capacity. To illustrate how these new performance measures can be used to analyze classifiers taking into account the entropy of the dataset, we carry out an intensive experiment in which we use the well-known J48 algorithm, and a UCI repository dataset on which we have previously selected a subset of the most relevant attributes. Then we carry out an extensive experiment in which we consider four heuristic classifiers, and 11 datasets. |
format | Online Article Text |
id | pubmed-8306704 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-83067042021-07-25 Benchmarking Analysis of the Accuracy of Classification Methods Related to Entropy Orenes, Yolanda Rabasa, Alejandro Rodriguez-Sala, Jesus Javier Sanchez-Soriano, Joaquin Entropy (Basel) Article In the machine learning literature we can find numerous methods to solve classification problems. We propose two new performance measures to analyze such methods. These measures are defined by using the concept of proportional reduction of classification error with respect to three benchmark classifiers, the random and two intuitive classifiers which are based on how a non-expert person could realize classification simply by applying a frequentist approach. We show that these three simple methods are closely related to different aspects of the entropy of the dataset. Therefore, these measures account somewhat for entropy in the dataset when evaluating the performance of classifiers. This allows us to measure the improvement in the classification results compared to simple methods, and at the same time how entropy affects classification capacity. To illustrate how these new performance measures can be used to analyze classifiers taking into account the entropy of the dataset, we carry out an intensive experiment in which we use the well-known J48 algorithm, and a UCI repository dataset on which we have previously selected a subset of the most relevant attributes. Then we carry out an extensive experiment in which we consider four heuristic classifiers, and 11 datasets. MDPI 2021-07-01 /pmc/articles/PMC8306704/ /pubmed/34356391 http://dx.doi.org/10.3390/e23070850 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Orenes, Yolanda Rabasa, Alejandro Rodriguez-Sala, Jesus Javier Sanchez-Soriano, Joaquin Benchmarking Analysis of the Accuracy of Classification Methods Related to Entropy |
title | Benchmarking Analysis of the Accuracy of Classification Methods Related to Entropy |
title_sort | benchmarking analysis of the accuracy of classification methods related to entropy |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8306704/ https://www.ncbi.nlm.nih.gov/pubmed/34356391 http://dx.doi.org/10.3390/e23070850 |
work_keys_str_mv | AT orenesyolanda benchmarkinganalysisoftheaccuracyofclassificationmethodsrelatedtoentropy AT rabasaalejandro benchmarkinganalysisoftheaccuracyofclassificationmethodsrelatedtoentropy AT rodriguezsalajesusjavier benchmarkinganalysisoftheaccuracyofclassificationmethodsrelatedtoentropy AT sanchezsorianojoaquin benchmarkinganalysisoftheaccuracyofclassificationmethodsrelatedtoentropy |
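
The description above mentions measuring the proportional reduction of classification error relative to simple benchmark classifiers, and relating those benchmarks to the entropy of the dataset. The record does not give the article's formulas, so the following is only a minimal sketch of the general idea, not the authors' definitions. The function names, the toy labels, and the choice of a majority-class baseline as a stand-in for an "intuitive" classifier are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical class distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def random_baseline_error(labels):
    """Expected error of guessing uniformly at random among the k observed classes."""
    k = len(set(labels))
    return 1.0 - 1.0 / k


def majority_baseline_error(labels):
    """Error of always predicting the most frequent class (assumed stand-in
    for a simple frequentist baseline; not taken from the article)."""
    counts = Counter(labels)
    return 1.0 - max(counts.values()) / len(labels)


def proportional_error_reduction(classifier_error, baseline_error):
    """Fraction of the baseline's error that the classifier removes."""
    if baseline_error == 0:
        raise ValueError("baseline error is zero; reduction is undefined")
    return (baseline_error - classifier_error) / baseline_error


# Hypothetical toy dataset: 10 labels across 3 classes.
labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
classifier_error = 0.2  # assumed error rate of some classifier under study

entropy_bits = shannon_entropy(labels)
reduction_vs_random = proportional_error_reduction(
    classifier_error, random_baseline_error(labels)
)
reduction_vs_majority = proportional_error_reduction(
    classifier_error, majority_baseline_error(labels)
)
print(entropy_bits, reduction_vs_random, reduction_vs_majority)
```

In this sketch, a skewed class distribution has lower entropy and a stronger majority-class baseline, so the same classifier error yields a smaller reduction against that baseline than against random guessing.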