A Method of Short Text Representation Fusion with Weighted Word Embeddings and Extended Topic Information

Bibliographic Details
Main Authors: Liu, Wenfu; Pang, Jianmin; Du, Qiming; Li, Nan; Yang, Shudan
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8839561/
https://www.ncbi.nlm.nih.gov/pubmed/35161808
http://dx.doi.org/10.3390/s22031066
Description
Summary: Short text representation is a basic and key task in NLP. The traditional approach simply merges the bag-of-words model with a topic model, which can leave semantic information ambiguous and topic information sparse. We propose an unsupervised text representation method that fuses word embeddings with extended topic information. Two strategies for fusing the weighted word embeddings with the extended topic information are designed: static linear fusion and dynamic fusion. The method highlights important semantic information, fuses topic information flexibly, and improves short text representation. We verify its effectiveness on classification and prediction tasks, and the test results show that the method is valid.
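
The summary names static linear fusion of weighted word embeddings and topic information but does not give the formulas. The sketch below is only one plausible reading, not the authors' implementation: it assumes per-word weights (for example TF-IDF scores) are already available, that the topic vector comes from some topic model, and that fusion means concatenating the two vectors after scaling them by (1 - lam) and lam. The names `weighted_average`, `static_linear_fusion`, and `lam` are hypothetical.

```python
from typing import Dict, List


def weighted_average(tokens: List[str],
                     embeddings: Dict[str, List[float]],
                     weights: Dict[str, float]) -> List[float]:
    """Weighted mean of the embeddings of the tokens found in `embeddings`.

    Assumption: `weights` holds a per-word importance score (e.g. TF-IDF);
    words without a score default to weight 1.0, unknown words are skipped.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok in embeddings:
            w = weights.get(tok, 1.0)
            weight_sum += w
            for i, x in enumerate(embeddings[tok]):
                total[i] += w * x
    if weight_sum == 0.0:
        return total  # no known words: return the zero vector
    return [x / weight_sum for x in total]


def static_linear_fusion(word_vec: List[float],
                         topic_vec: List[float],
                         lam: float = 0.5) -> List[float]:
    """Assumed fusion rule: scale each view by a fixed coefficient, then concatenate."""
    return [(1.0 - lam) * x for x in word_vec] + [lam * t for t in topic_vec]


# Toy data, invented purely for illustration.
toy_embeddings = {"cheap": [1.0, 0.0], "flights": [0.0, 1.0]}
toy_weights = {"cheap": 1.0, "flights": 3.0}
toy_topics = [0.8, 0.2]  # hypothetical topic distribution of the short text

doc_vec = weighted_average(["cheap", "flights", "unknown"], toy_embeddings, toy_weights)
fused = static_linear_fusion(doc_vec, toy_topics, lam=0.5)
```

With a fixed `lam` the balance between semantic and topic information is the same for every text. A dynamic strategy would presumably choose that balance per text, but the summary does not say how, so it is not sketched here.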