Entropy based C4.5-SHO algorithm with information gain optimization in data mining
Information efficiency is gaining more importance in the development as well as application sectors of information technology. Data mining is a computer-assisted process of massive data investigation that extracts meaningful information from the datasets. The mined information is used in decision-ma...
Main Authors: | Reddy, G Sekhar; Chittineni, Suneetha |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Algorithms and Analysis of Algorithms |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8049126/ https://www.ncbi.nlm.nih.gov/pubmed/33954229 http://dx.doi.org/10.7717/peerj-cs.424 |
_version_ | 1783679368466268160 |
---|---|
author | Reddy, G Sekhar Chittineni, Suneetha |
author_facet | Reddy, G Sekhar Chittineni, Suneetha |
author_sort | Reddy, G Sekhar |
collection | PubMed |
description | Information efficiency is gaining more importance in the development as well as application sectors of information technology. Data mining is a computer-assisted process of massive data investigation that extracts meaningful information from the datasets. The mined information is used in decision-making to understand the behavior of each attribute. Therefore, a new classification algorithm is introduced in this paper to improve information management. The classical C4.5 decision tree approach is combined with the Selfish Herd Optimization (SHO) algorithm to tune the gain of given datasets. The optimal weights for the information gain will be updated based on SHO. Further, the dataset is partitioned into two classes based on quadratic entropy calculation and information gain. Decision tree gain optimization is the main aim of our proposed C4.5-SHO method. The robustness of the proposed method is evaluated on various datasets and compared with classifiers, such as ID3 and CART. The accuracy and area under the receiver operating characteristic curve parameters are estimated and compared with existing algorithms like ant colony optimization, particle swarm optimization and cuckoo search. |
format | Online Article Text |
id | pubmed-8049126 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8049126 2021-05-04 Entropy based C4.5-SHO algorithm with information gain optimization in data mining Reddy, G Sekhar Chittineni, Suneetha PeerJ Comput Sci Algorithms and Analysis of Algorithms Information efficiency is gaining more importance in the development as well as application sectors of information technology. Data mining is a computer-assisted process of massive data investigation that extracts meaningful information from the datasets. The mined information is used in decision-making to understand the behavior of each attribute. Therefore, a new classification algorithm is introduced in this paper to improve information management. The classical C4.5 decision tree approach is combined with the Selfish Herd Optimization (SHO) algorithm to tune the gain of given datasets. The optimal weights for the information gain will be updated based on SHO. Further, the dataset is partitioned into two classes based on quadratic entropy calculation and information gain. Decision tree gain optimization is the main aim of our proposed C4.5-SHO method. The robustness of the proposed method is evaluated on various datasets and compared with classifiers, such as ID3 and CART. The accuracy and area under the receiver operating characteristic curve parameters are estimated and compared with existing algorithms like ant colony optimization, particle swarm optimization and cuckoo search. PeerJ Inc. 2021-04-07 /pmc/articles/PMC8049126/ /pubmed/33954229 http://dx.doi.org/10.7717/peerj-cs.424 Text en © 2021 Reddy and Chittineni https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Algorithms and Analysis of Algorithms Reddy, G Sekhar Chittineni, Suneetha Entropy based C4.5-SHO algorithm with information gain optimization in data mining |
title | Entropy based C4.5-SHO algorithm with information gain optimization in data mining |
title_full | Entropy based C4.5-SHO algorithm with information gain optimization in data mining |
title_fullStr | Entropy based C4.5-SHO algorithm with information gain optimization in data mining |
title_full_unstemmed | Entropy based C4.5-SHO algorithm with information gain optimization in data mining |
title_short | Entropy based C4.5-SHO algorithm with information gain optimization in data mining |
title_sort | entropy based c4.5-sho algorithm with information gain optimization in data mining |
topic | Algorithms and Analysis of Algorithms |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8049126/ https://www.ncbi.nlm.nih.gov/pubmed/33954229 http://dx.doi.org/10.7717/peerj-cs.424 |
work_keys_str_mv | AT reddygsekhar entropybasedc45shoalgorithmwithinformationgainoptimizationindatamining AT chittinenisuneetha entropybasedc45shoalgorithmwithinformationgainoptimizationindatamining |
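
The description field above mentions partitioning a dataset using a quadratic entropy calculation and information gain. As a rough illustration only, the Python sketch below computes both for a candidate split. It assumes quadratic entropy means the sum of p_i(1 - p_i) over class proportions, which this record does not define. The function names and the `weight` parameter are hypothetical, and the Selfish Herd Optimization step that would tune such a weight is not shown. This is not the authors' implementation.

```python
from collections import Counter


def quadratic_entropy(labels):
    """Assumed definition: sum of p_i * (1 - p_i) over class proportions p_i."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def information_gain(labels, groups, weight=1.0):
    """Parent impurity minus the size-weighted impurity of the child groups.

    `weight` is a hypothetical placeholder for a value an optimizer could tune.
    """
    n = len(labels)
    children = sum(len(g) / n * quadratic_entropy(g) for g in groups)
    return weight * (quadratic_entropy(labels) - children)


# Toy data: a split that separates the classes versus one that does not.
labels = ["yes", "yes", "no", "no"]
pure_split = [["yes", "yes"], ["no", "no"]]
mixed_split = [["yes", "no"], ["yes", "no"]]

print(information_gain(labels, pure_split))
print(information_gain(labels, mixed_split))
```

On this toy data the split that separates the classes yields a larger gain than the split that leaves both groups mixed, which is the property a tree builder relies on when ranking candidate attributes.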