A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information
Short text representation is one of the basic and key tasks of NLP. The traditional method is to simply merge the bag-of-words model and the topic model, which may lead to the problem of ambiguity in semantic information, and leave topic information sparse. We propose an unsupervised text representa...
Main authors: | Liu, Wenfu; Pang, Jianmin; Du, Qiming; Li, Nan; Yang, Shudan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8839561/ https://www.ncbi.nlm.nih.gov/pubmed/35161808 http://dx.doi.org/10.3390/s22031066 |
_version_ | 1784650399052988416 |
---|---|
author | Liu, Wenfu Pang, Jianmin Du, Qiming Li, Nan Yang, Shudan |
author_facet | Liu, Wenfu Pang, Jianmin Du, Qiming Li, Nan Yang, Shudan |
author_sort | Liu, Wenfu |
collection | PubMed |
description | Short text representation is one of the basic and key tasks of NLP. The traditional method is to simply merge the bag-of-words model and the topic model, which may lead to the problem of ambiguity in semantic information, and leave topic information sparse. We propose an unsupervised text representation method that involves fusing word embeddings and extended topic information. Following this, two fusion strategies of weighted word embeddings and extended topic information are designed: static linear fusion and dynamic fusion. This method can highlight important semantic information, flexibly fuse topic information, and improve the capabilities of short text representation. We use classification and prediction tasks to verify the effectiveness of the method. The testing results show that the method is valid. |
format | Online Article Text |
id | pubmed-8839561 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8839561 2022-02-13 A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information Liu, Wenfu Pang, Jianmin Du, Qiming Li, Nan Yang, Shudan Sensors (Basel) Article Short text representation is one of the basic and key tasks of NLP. The traditional method is to simply merge the bag-of-words model and the topic model, which may lead to the problem of ambiguity in semantic information, and leave topic information sparse. We propose an unsupervised text representation method that involves fusing word embeddings and extended topic information. Following this, two fusion strategies of weighted word embeddings and extended topic information are designed: static linear fusion and dynamic fusion. This method can highlight important semantic information, flexibly fuse topic information, and improve the capabilities of short text representation. We use classification and prediction tasks to verify the effectiveness of the method. The testing results show that the method is valid. MDPI 2022-01-29 /pmc/articles/PMC8839561/ /pubmed/35161808 http://dx.doi.org/10.3390/s22031066 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Liu, Wenfu Pang, Jianmin Du, Qiming Li, Nan Yang, Shudan A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information |
title | A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information |
title_full | A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information |
title_fullStr | A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information |
title_full_unstemmed | A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information |
title_short | A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information |
title_sort | method of short text representation fusion with weighted word embeddings and extended topic information |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8839561/ https://www.ncbi.nlm.nih.gov/pubmed/35161808 http://dx.doi.org/10.3390/s22031066 |
work_keys_str_mv | AT liuwenfu amethodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT pangjianmin amethodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT duqiming amethodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT linan amethodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT yangshudan amethodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT liuwenfu methodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT pangjianmin methodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT duqiming methodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT linan methodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation AT yangshudan methodofshorttextrepresentationfusionwithweightedwordembeddingsandextendedtopicinformation |