
A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information

Short text representation is one of the basic and key tasks of NLP. The traditional method is to simply merge the bag-of-words model and the topic model, which may lead to the problem of ambiguity in semantic information, and leave topic information sparse. We propose an unsupervised text representa...


Bibliographic Details
Main authors: Liu, Wenfu, Pang, Jianmin, Du, Qiming, Li, Nan, Yang, Shudan
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8839561/
https://www.ncbi.nlm.nih.gov/pubmed/35161808
http://dx.doi.org/10.3390/s22031066
_version_ 1784650399052988416
author Liu, Wenfu
Pang, Jianmin
Du, Qiming
Li, Nan
Yang, Shudan
author_sort Liu, Wenfu
collection PubMed
description Short text representation is one of the basic and key tasks of NLP. The traditional method is to simply merge the bag-of-words model and the topic model, which may lead to the problem of ambiguity in semantic information, and leave topic information sparse. We propose an unsupervised text representation method that involves fusing word embeddings and extended topic information. Following this, two fusion strategies of weighted word embeddings and extended topic information are designed: static linear fusion and dynamic fusion. This method can highlight important semantic information, flexibly fuse topic information, and improve the capabilities of short text representation. We use classification and prediction tasks to verify the effectiveness of the method. The testing results show that the method is valid.
format Online
Article
Text
id pubmed-8839561
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-88395612022-02-13 A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information Liu, Wenfu Pang, Jianmin Du, Qiming Li, Nan Yang, Shudan Sensors (Basel) Article Short text representation is one of the basic and key tasks of NLP. The traditional method is to simply merge the bag-of-words model and the topic model, which may lead to the problem of ambiguity in semantic information, and leave topic information sparse. We propose an unsupervised text representation method that involves fusing word embeddings and extended topic information. Following this, two fusion strategies of weighted word embeddings and extended topic information are designed: static linear fusion and dynamic fusion. This method can highlight important semantic information, flexibly fuse topic information, and improve the capabilities of short text representation. We use classification and prediction tasks to verify the effectiveness of the method. The testing results show that the method is valid. MDPI 2022-01-29 /pmc/articles/PMC8839561/ /pubmed/35161808 http://dx.doi.org/10.3390/s22031066 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Liu, Wenfu
Pang, Jianmin
Du, Qiming
Li, Nan
Yang, Shudan
A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information
title A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information
title_sort method of short text representation fusion with weighted word embeddings and extended topic information
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8839561/
https://www.ncbi.nlm.nih.gov/pubmed/35161808
http://dx.doi.org/10.3390/s22031066