TKFIM: Top-K frequent itemset mining technique based on equivalence classes

Frequent itemset (FI) mining is a significant subject of data mining research. Over the last ten years, owing to innovative development, the quantity of data has grown exponentially, which imposes new challenges on FI mining applications. Misconceived information may be found in recent...

Bibliographic Details
Main authors:	Iqbal, Saood; Shahid, Abdul; Roman, Muhammad; Khan, Zahid; Al-Otaibi, Shaha; Yu, Lisu
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2021
Subjects:	Algorithms and Analysis of Algorithms
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7959650/ https://www.ncbi.nlm.nih.gov/pubmed/33817031 http://dx.doi.org/10.7717/peerj-cs.385

_version_	1783664995913957376
author	Iqbal, Saood; Shahid, Abdul; Roman, Muhammad; Khan, Zahid; Al-Otaibi, Shaha; Yu, Lisu
author_sort	Iqbal, Saood
collection	PubMed
description	Frequent itemset (FI) mining is a significant subject of data mining research. Over the last ten years, owing to innovative development, the quantity of data has grown exponentially, which imposes new challenges on FI mining applications. Recent algorithms, both threshold-based and size-based, may report misleading information. The support threshold plays a central role in generating frequent itemsets from a given dataset, yet selecting a threshold value is very difficult for those unaware of the dataset's characteristics. Algorithms that find FIs without a support threshold, however, perform poorly because of heavy computation. We therefore propose a method for discovering FIs without a support threshold, called Top-k frequent itemsets mining (TKFIM). It uses equivalence classes and set-theory concepts to mine FIs. The proposed procedure does not miss any FIs; thus, accurate frequent patterns are mined. The results are compared with state-of-the-art techniques, namely Top-k miner and Build Once and Mine Once (BOMO). TKFIM outperforms these approaches in terms of execution and performance, achieving gains of 92.70, 35.87, 28.53, and 81.27 percent over Top-k miner on the Chess, Mushroom, Connect, and T10I4D100K datasets, respectively. Similarly, it achieves gains of 97.14, 100, 78.10, and 99.70 percent over BOMO on the Chess, Mushroom, Connect, and T10I4D100K datasets, respectively. It is therefore argued that the proposed procedure may be adopted on large datasets for better performance.
format	Online Article Text
id	pubmed-7959650
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-7959650 2021-04-02 TKFIM: Top-K frequent itemset mining technique based on equivalence classes Iqbal, Saood; Shahid, Abdul; Roman, Muhammad; Khan, Zahid; Al-Otaibi, Shaha; Yu, Lisu PeerJ Comput Sci Algorithms and Analysis of Algorithms PeerJ Inc. 2021-03-08 /pmc/articles/PMC7959650/ /pubmed/33817031 http://dx.doi.org/10.7717/peerj-cs.385 Text en ©2021 Iqbal et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	TKFIM: Top-K frequent itemset mining technique based on equivalence classes
topic	Algorithms and Analysis of Algorithms
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7959650/ https://www.ncbi.nlm.nih.gov/pubmed/33817031 http://dx.doi.org/10.7717/peerj-cs.385
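
The description above says that TKFIM mines the top-k frequent itemsets without a user-supplied support threshold, using equivalence classes and set-theory concepts. The record does not give the algorithm itself, so the following is only a minimal Python sketch of that general idea and not the authors' TKFIM procedure. It assumes a vertical layout (each item mapped to the set of transaction ids containing it), prefix-based equivalence classes extended by set intersection, and a size-k min-heap whose smallest support serves as a rising internal threshold. The function name `top_k_itemsets` and the sample `transactions` are hypothetical.

```python
import heapq
from itertools import count


def top_k_itemsets(transactions, k):
    """Return up to k (itemset, support) pairs with the highest support.

    Illustrative sketch only (not the published TKFIM algorithm).
    Ties at the k-th position are broken arbitrarily.
    """
    if k <= 0:
        return []

    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    heap = []           # min-heap of (support, tiebreak, itemset), at most k entries
    tiebreak = count()  # avoids comparing itemsets when supports are equal

    def threshold():
        # Until k itemsets are known, any itemset with support >= 1 qualifies;
        # afterwards the k-th best support found so far is the bar.
        return heap[0][0] if len(heap) == k else 1

    def offer(itemset, support):
        entry = (support, next(tiebreak), itemset)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif support > heap[0][0]:
            heapq.heapreplace(heap, entry)

    def mine(eq_class):
        # eq_class: list of (itemset, tidset) pairs that share a common prefix.
        for i, (itemset, tids) in enumerate(eq_class):
            if len(tids) < threshold():
                continue  # no superset can have higher support, so prune
            offer(itemset, len(tids))
            children = []
            for other, other_tids in eq_class[i + 1:]:
                common = tids & other_tids  # support of the union itemset
                if len(common) >= threshold():
                    children.append((itemset + (other[-1],), common))
            if children:
                mine(children)

    # Visit high-support items first so the internal threshold rises quickly.
    root = sorted(tidsets.items(), key=lambda kv: -len(kv[1]))
    mine([((item,), tids) for item, tids in root])

    ranked = sorted(heap, key=lambda entry: -entry[0])
    return [(itemset, support) for support, _, itemset in ranked]


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [("a", "b", "c"), ("a", "b"), ("a", "c"), ("a",), ("b", "c")]
print(top_k_itemsets(transactions, 3))
```

The caller supplies only k. Pruning is safe because the support of an itemset never exceeds the support of any of its subsets, so a candidate that falls below the current k-th best support cannot lead to a better result.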
