Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension
The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction a...
Main authors: | Liu, Yuanchao; Liu, Ming; Wang, Xin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367988/ https://www.ncbi.nlm.nih.gov/pubmed/25794172 http://dx.doi.org/10.1371/journal.pone.0117390 |
_version_ | 1782362578953961472 |
---|---|
author | Liu, Yuanchao; Liu, Ming; Wang, Xin |
collection | PubMed |
description | The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach. |
format | Online Article Text |
id | pubmed-4367988 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4367988; 2015-03-27; Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension; Liu, Yuanchao; Liu, Ming; Wang, Xin; PLoS One; Research Article; The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.; Public Library of Science; 2015-03-20; /pmc/articles/PMC4367988/; /pubmed/25794172; http://dx.doi.org/10.1371/journal.pone.0117390; Text; en; © 2015 Liu et al; http://creativecommons.org/licenses/by/4.0/; This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367988/ https://www.ncbi.nlm.nih.gov/pubmed/25794172 http://dx.doi.org/10.1371/journal.pone.0117390 |