
Solving text clustering problem using a memetic differential evolution algorithm

Text clustering is considered one of the most effective methods of text document analysis, and it is applied to cluster documents in response to the growth of big data and online information. According to a review of related work, existing text clustering algorithms achieved...


Bibliographic Details
Main Authors: Mustafa, Hossam M. J.; Ayob, Masri; Albashish, Dheeb; Abu-Taleb, Sawsan
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7289410/
https://www.ncbi.nlm.nih.gov/pubmed/32525869
http://dx.doi.org/10.1371/journal.pone.0232816
author Mustafa, Hossam M. J.
Ayob, Masri
Albashish, Dheeb
Abu-Taleb, Sawsan
collection PubMed
description Text clustering is considered one of the most effective methods of text document analysis, and it is applied to cluster documents in response to the growth of big data and online information. According to a review of related work, existing text clustering algorithms achieved reasonable results on some datasets but failed on a wide variety of benchmark datasets. Furthermore, their performance was not robust because of a poor balance between the exploitation and exploration capabilities of the clustering algorithm. Accordingly, this research proposes a Memetic Differential Evolution algorithm (MDETC) to solve the text clustering problem and examines the effect of hybridizing the differential evolution (DE) mutation strategy with the memetic algorithm (MA). This hybridization is intended to enhance the quality of text clustering and improve the exploitation and exploration capabilities of the algorithm. Experimental results on six standard text clustering benchmark datasets from the Laboratory of Computational Intelligence (LABIC) show that the MDETC algorithm outperformed the other compared clustering algorithms in terms of the AUC metric, F-measure, and statistical analysis. Furthermore, when compared with state-of-the-art text clustering algorithms, MDETC obtained almost the best results on the standard benchmark datasets.
format Online
Article
Text
id pubmed-7289410
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7289410 2020-06-15 Solving text clustering problem using a memetic differential evolution algorithm Mustafa, Hossam M. J. Ayob, Masri Albashish, Dheeb Abu-Taleb, Sawsan PLoS One Research Article
Public Library of Science 2020-06-11 /pmc/articles/PMC7289410/ /pubmed/32525869 http://dx.doi.org/10.1371/journal.pone.0232816 Text en © 2020 Mustafa et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Solving text clustering problem using a memetic differential evolution algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7289410/
https://www.ncbi.nlm.nih.gov/pubmed/32525869
http://dx.doi.org/10.1371/journal.pone.0232816