
Entropy based C4.5-SHO algorithm with information gain optimization in data mining

Information efficiency is gaining more importance in the development as well as application sectors of information technology. Data mining is a computer-assisted process of massive data investigation that extracts meaningful information from the datasets. The mined information is used in decision-ma...

Bibliographic Details
Main Authors: Reddy, G Sekhar; Chittineni, Suneetha
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8049126/
https://www.ncbi.nlm.nih.gov/pubmed/33954229
http://dx.doi.org/10.7717/peerj-cs.424
author Reddy, G Sekhar
Chittineni, Suneetha
collection PubMed
description Information efficiency is gaining more importance in the development as well as application sectors of information technology. Data mining is a computer-assisted process of massive data investigation that extracts meaningful information from the datasets. The mined information is used in decision-making to understand the behavior of each attribute. Therefore, a new classification algorithm is introduced in this paper to improve information management. The classical C4.5 decision tree approach is combined with the Selfish Herd Optimization (SHO) algorithm to tune the gain of given datasets. The optimal weights for the information gain will be updated based on SHO. Further, the dataset is partitioned into two classes based on quadratic entropy calculation and information gain. Decision tree gain optimization is the main aim of our proposed C4.5-SHO method. The robustness of the proposed method is evaluated on various datasets and compared with classifiers, such as ID3 and CART. The accuracy and area under the receiver operating characteristic curve parameters are estimated and compared with existing algorithms like ant colony optimization, particle swarm optimization and cuckoo search.
format Online
Article
Text
id pubmed-8049126
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8049126
2021-05-04
Entropy based C4.5-SHO algorithm with information gain optimization in data mining
Reddy, G Sekhar
Chittineni, Suneetha
PeerJ Comput Sci
Algorithms and Analysis of Algorithms
PeerJ Inc. 2021-04-07
/pmc/articles/PMC8049126/
/pubmed/33954229
http://dx.doi.org/10.7717/peerj-cs.424
Text en
© 2021 Reddy and Chittineni
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Entropy based C4.5-SHO algorithm with information gain optimization in data mining
topic Algorithms and Analysis of Algorithms