Cargando…
Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension
The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction a...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2015
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367988/ https://www.ncbi.nlm.nih.gov/pubmed/25794172 http://dx.doi.org/10.1371/journal.pone.0117390 |
Sumario: | The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach. |
---|