Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension

The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction a...

Bibliographic Details
Main Authors: Liu, Yuanchao; Liu, Ming; Wang, Xin
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367988/
https://www.ncbi.nlm.nih.gov/pubmed/25794172
http://dx.doi.org/10.1371/journal.pone.0117390
_version_ 1782362578953961472
author Liu, Yuanchao
Liu, Ming
Wang, Xin
collection PubMed
description The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.
format Online
Article
Text
id pubmed-4367988
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4367988 2015-03-27. Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension. Liu, Yuanchao; Liu, Ming; Wang, Xin. PLoS One. Research Article. Public Library of Science, 2015-03-20. /pmc/articles/PMC4367988/ /pubmed/25794172 http://dx.doi.org/10.1371/journal.pone.0117390 Text en. © 2015 Liu et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension
topic Research Article