Weighted Joint Sentiment-Topic Model for Sentiment Analysis Compared to ALGA: Adaptive Lexicon Learning Using Genetic Algorithm

Latent Dirichlet Allocation (LDA) is an approach to unsupervised learning that aims to investigate the semantics among words in a document as well as the influence of a subject on a word. As an LDA-based model, Joint Sentiment-Topic (JST) examines the impact of topics and emotions on words. The emot...

Bibliographic Details
Main authors: Osmani, Amjad, Bagherzadeh Mohasefi, Jamshid
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9374039/
https://www.ncbi.nlm.nih.gov/pubmed/35965748
http://dx.doi.org/10.1155/2022/7612276
author Osmani, Amjad
Bagherzadeh Mohasefi, Jamshid
collection PubMed
description Latent Dirichlet Allocation (LDA) is an unsupervised learning approach that aims to investigate the semantics among words in a document as well as the influence of a subject on a word. As an LDA-based model, Joint Sentiment-Topic (JST) examines the impact of topics and emotions on words. The emotion parameter alone is insufficient, and additional parameters may play valuable roles in achieving better performance. In this study, two new topic models, Weighted Joint Sentiment-Topic (WJST) and Weighted Joint Sentiment-Topic 1 (WJST1), are presented to extend and improve JST through two new parameters that can generate a sentiment dictionary. In the proposed methods, each word in a document affects its neighbors, and different words in the document may be affected simultaneously by several neighboring words. The proposed models therefore consider the effect of words on each other, which, in our view, is an important factor and can increase the performance of baseline methods. According to the evaluation results, the new parameters have an immense effect on model accuracy. Although they do not require labeled data, the proposed methods are more accurate in the evaluation than discriminative models such as SVM and logistic regression. The proposed methods are simple, with a small number of parameters. While providing a broad view of the connections between different words in documents of a single collection (single-domain) or multiple collections (multidomain), the proposed methods offer a solution for each of these two situations. WJST is suitable for multidomain datasets, and WJST1 is a version of WJST that is suitable for single-domain datasets. While able to detect emotion at the document level, the proposed models improve on the evaluation outcomes of the baseline approaches. Thirteen datasets of different sizes were used in the experiments. Perplexity, document-level opinion mining, and topic coherence are employed for assessment. In addition, the Friedman test is used to check whether the results of the proposed models are statistically different from those of other algorithms. The results show that the accuracy of the proposed methods is above 80% for most of the datasets. WJST1 achieves the highest accuracy on the Movie dataset, at 97 percent, and WJST achieves the highest accuracy on the Electronic dataset, at 86 percent. The proposed models obtain better results than Adaptive Lexicon learning using Genetic Algorithm (ALGA), which employs an evolutionary approach to build an emotion dictionary. The results also show that the proposed methods perform better with different topic number settings, especially WJST1, with 97% accuracy at |Z| = 5 on the Movie dataset.
format Online
Article
Text
id pubmed-9374039
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9374039 2022-08-13 Comput Intell Neurosci Research Article Hindawi 2022-07-31 /pmc/articles/PMC9374039/ /pubmed/35965748 http://dx.doi.org/10.1155/2022/7612276 Text en Copyright © 2022 Amjad Osmani and Jamshid Bagherzadeh Mohasefi. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Weighted Joint Sentiment-Topic Model for Sentiment Analysis Compared to ALGA: Adaptive Lexicon Learning Using Genetic Algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9374039/
https://www.ncbi.nlm.nih.gov/pubmed/35965748
http://dx.doi.org/10.1155/2022/7612276