
Enhancing web search result clustering model based on multiview multirepresentation consensus cluster ensemble (mmcc) approach

Existing text clustering methods utilize only one representation at a time (single view), whereas multiple views can represent documents. The multiview multirepresentation method enhances clustering quality. Moreover, existing clustering methods that utilize more than one representation at a time (m...


Bibliographic Details
Main Authors: Sabah, Ali; Tiun, Sabrina; Sani, Nor Samsiah; Ayob, Masri; Taha, Adil Yaseen
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7810326/
https://www.ncbi.nlm.nih.gov/pubmed/33449949
http://dx.doi.org/10.1371/journal.pone.0245264
_version_ 1783637293624459264
author Sabah, Ali
Tiun, Sabrina
Sani, Nor Samsiah
Ayob, Masri
Taha, Adil Yaseen
collection PubMed
description Existing text clustering methods utilize only one representation at a time (single view), whereas multiple views can represent documents. The multiview multirepresentation method enhances clustering quality. Moreover, existing clustering methods that utilize more than one representation at a time (multiview) use representation with the same nature. Hence, using multiple views that represent data in a different representation with clustering methods is reasonable to create a diverse set of candidate clustering solutions. On this basis, an effective dynamic clustering method must consider combining multiple views of data including semantic view, lexical view (word weighting), and topic view as well as the number of clusters. The main goal of this study is to develop a new method that can improve the performance of web search result clustering (WSRC). An enhanced multiview multirepresentation consensus clustering ensemble (MMCC) method is proposed to create a set of diverse candidate solutions and select a high-quality overlapping cluster. The overlapping clusters are obtained from the candidate solutions created by different clustering methods. The framework to develop the proposed MMCC includes numerous stages: (1) acquiring the standard datasets (MORESQUE and Open Directory Project-239), which are used to validate search result clustering algorithms, (2) preprocessing the dataset, (3) applying multiview multirepresentation clustering models, (4) using the radius-based cluster number estimation algorithm, and (5) employing the consensus clustering ensemble method. Results show an improvement in clustering methods when multiview multirepresentation is used. More importantly, the proposed MMCC model improves the overall performance of WSRC compared with all single-view clustering models.
format Online
Article
Text
id pubmed-7810326
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7810326 2021-01-27 Enhancing web search result clustering model based on multiview multirepresentation consensus cluster ensemble (mmcc) approach Sabah, Ali Tiun, Sabrina Sani, Nor Samsiah Ayob, Masri Taha, Adil Yaseen PLoS One Research Article Public Library of Science 2021-01-15 /pmc/articles/PMC7810326/ /pubmed/33449949 http://dx.doi.org/10.1371/journal.pone.0245264 Text en © 2021 Sabah et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Enhancing web search result clustering model based on multiview multirepresentation consensus cluster ensemble (mmcc) approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7810326/
https://www.ncbi.nlm.nih.gov/pubmed/33449949
http://dx.doi.org/10.1371/journal.pone.0245264
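
Illustrative sketch: As a rough illustration of stage (5) in the description above, the code below combines candidate clusterings from several views through a co-association matrix. It is a generic evidence-accumulation style consensus, not the authors' MMCC method: it yields non-overlapping groups, whereas the paper selects overlapping clusters. The function names, the 0.5 threshold, and the three label lists are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of candidate partitions that put each pair of items together."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same items")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Group items whose co-association exceeds `threshold` (assumed value)."""
    n = len(partitions[0])
    co = co_association(partitions)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical cluster labels for six search results, one list per view.
semantic_view = [0, 0, 0, 1, 1, 1]
lexical_view = [0, 0, 1, 1, 2, 2]
topic_view = [1, 1, 1, 0, 0, 0]

candidates = [semantic_view, lexical_view, topic_view]
print(consensus_clusters(candidates))  # [[0, 1, 2], [3, 4, 5]]
```

Label values need not agree across views, because only pairwise co-membership is counted. Raising the threshold makes the consensus stricter and splits groups that only some views support.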