A Topic Recognition Method of News Text Based on Word Embedding Enhancement
Topic recognition technology has been commonly applied to identify different categories of news topics from the vast amount of web information, which has a wide application prospect in the field of online public opinion monitoring, news recommendation, and so on. However, it is very challenging to e...
Main authors: | Du, Qiming; Li, Nan; Liu, Wenfu; Sun, Daozhu; Yang, Shudan; Yue, Feng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8865979/ https://www.ncbi.nlm.nih.gov/pubmed/35222628 http://dx.doi.org/10.1155/2022/4582480 |
_version_ | 1784655736014372864 |
---|---|
author | Du, Qiming Li, Nan Liu, Wenfu Sun, Daozhu Yang, Shudan Yue, Feng |
author_facet | Du, Qiming Li, Nan Liu, Wenfu Sun, Daozhu Yang, Shudan Yue, Feng |
author_sort | Du, Qiming |
collection | PubMed |
description | Topic recognition technology has been commonly applied to identify different categories of news topics from the vast amount of web information, which has a wide application prospect in the field of online public opinion monitoring, news recommendation, and so on. However, it is very challenging to effectively utilize key feature information such as syntax and semantics in the text to improve topic recognition accuracy. Some researchers proposed to combine the topic model with the word embedding model, whose results had shown that this approach could enrich text representation and benefit natural language processing downstream tasks. However, for the topic recognition problem of news texts, there is currently no standard way of combining topic model and word embedding model. Besides, some existing similar approaches were more complex and did not consider the fusion between topic distribution of different granularity and word embedding information. Therefore, this paper proposes a novel text representation method based on word embedding enhancement and further forms a full-process topic recognition framework for news text. In contrast to traditional topic recognition methods, this framework is designed to use the probabilistic topic model LDA, the word embedding models Word2vec and Glove to fully extract and integrate the topic distribution, semantic knowledge, and syntactic relationship of the text, and then use popular classifiers to automatically recognize the topic categories of news based on the obtained text representation vectors. As a result, the proposed framework can take advantage of the relationship between document and topic and the context information, which improves the expressive ability and reduces the dimensionality. Based on the two benchmark datasets of 20NewsGroup and BBC News, the experimental results verify the effectiveness and superiority of the proposed method based on word embedding enhancement for the news topic recognition problem. |
format | Online Article Text |
id | pubmed-8865979 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8865979 2022-02-24 A Topic Recognition Method of News Text Based on Word Embedding Enhancement Du, Qiming Li, Nan Liu, Wenfu Sun, Daozhu Yang, Shudan Yue, Feng Comput Intell Neurosci Research Article Topic recognition technology has been commonly applied to identify different categories of news topics from the vast amount of web information, which has a wide application prospect in the field of online public opinion monitoring, news recommendation, and so on. However, it is very challenging to effectively utilize key feature information such as syntax and semantics in the text to improve topic recognition accuracy. Some researchers proposed to combine the topic model with the word embedding model, whose results had shown that this approach could enrich text representation and benefit natural language processing downstream tasks. However, for the topic recognition problem of news texts, there is currently no standard way of combining topic model and word embedding model. Besides, some existing similar approaches were more complex and did not consider the fusion between topic distribution of different granularity and word embedding information. Therefore, this paper proposes a novel text representation method based on word embedding enhancement and further forms a full-process topic recognition framework for news text. In contrast to traditional topic recognition methods, this framework is designed to use the probabilistic topic model LDA, the word embedding models Word2vec and Glove to fully extract and integrate the topic distribution, semantic knowledge, and syntactic relationship of the text, and then use popular classifiers to automatically recognize the topic categories of news based on the obtained text representation vectors. As a result, the proposed framework can take advantage of the relationship between document and topic and the context information, which improves the expressive ability and reduces the dimensionality. Based on the two benchmark datasets of 20NewsGroup and BBC News, the experimental results verify the effectiveness and superiority of the proposed method based on word embedding enhancement for the news topic recognition problem. Hindawi 2022-02-16 /pmc/articles/PMC8865979/ /pubmed/35222628 http://dx.doi.org/10.1155/2022/4582480 Text en Copyright © 2022 Qiming Du et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Du, Qiming Li, Nan Liu, Wenfu Sun, Daozhu Yang, Shudan Yue, Feng A Topic Recognition Method of News Text Based on Word Embedding Enhancement |
title | A Topic Recognition Method of News Text Based on Word Embedding Enhancement |
title_full | A Topic Recognition Method of News Text Based on Word Embedding Enhancement |
title_fullStr | A Topic Recognition Method of News Text Based on Word Embedding Enhancement |
title_full_unstemmed | A Topic Recognition Method of News Text Based on Word Embedding Enhancement |
title_short | A Topic Recognition Method of News Text Based on Word Embedding Enhancement |
title_sort | topic recognition method of news text based on word embedding enhancement |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8865979/ https://www.ncbi.nlm.nih.gov/pubmed/35222628 http://dx.doi.org/10.1155/2022/4582480 |
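
The description above speaks of integrating a document's topic distribution with word embedding information into a single text representation vector. The record does not say how the authors perform that fusion, so the following is only a minimal sketch of one simple possibility, plain concatenation of a topic distribution with an averaged word embedding. The function names `mean_embedding` and `fuse`, the toy word vectors, and the toy topic distribution are all hypothetical and are not taken from the article.

```python
from typing import Dict, List, Sequence


def mean_embedding(tokens: Sequence[str],
                   vectors: Dict[str, List[float]],
                   dim: int) -> List[float]:
    """Average the embeddings of the tokens that have a vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        # No token is in the vocabulary: fall back to a zero vector.
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def fuse(topic_dist: Sequence[float],
         tokens: Sequence[str],
         vectors: Dict[str, List[float]],
         dim: int) -> List[float]:
    """Concatenate a document-topic distribution with a mean word embedding.

    Assumption: simple concatenation, not necessarily the article's method.
    """
    return list(topic_dist) + mean_embedding(tokens, vectors, dim)


# Hypothetical toy inputs: a 2-topic distribution (as a topic model such as
# LDA might produce) and 2-dimensional word vectors (as Word2vec or GloVe
# might produce). Real models would be trained on a news corpus.
toy_vectors = {"market": [1.0, 0.0], "stocks": [0.0, 1.0]}
toy_topics = [0.75, 0.25]

doc_vector = fuse(toy_topics, ["market", "stocks", "unseen"], toy_vectors, dim=2)
print(doc_vector)  # [0.75, 0.25, 0.5, 0.5]
```

A vector built this way could then be passed to any standard classifier. In practice the topic distribution and the word vectors would come from trained models rather than hand-written values.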