SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis

Bibliographic Details
Main authors: Kanfoud, Mohamed Raouf, Bouramoul, Abdelkrim
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9130974/
https://www.ncbi.nlm.nih.gov/pubmed/35645462
http://dx.doi.org/10.1007/s10844-022-00714-8
_version_ 1784713085679828992
author Kanfoud, Mohamed Raouf
Bouramoul, Abdelkrim
author_facet Kanfoud, Mohamed Raouf
Bouramoul, Abdelkrim
author_sort Kanfoud, Mohamed Raouf
collection PubMed
description The main objective of multilingual sentiment analysis is to analyze reviews regardless of the original language in which they are written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is a challenge since each language is different in terms of syntax, grammar, etc. This paper presents a new language-independent representation approach for sentiment analysis, SentiCode. Unlike previous work in multilingual sentiment analysis, the proposed approach does not rely on machine translation to bridge the gap between different languages. Instead, it exploits common features of languages, such as part-of-speech tags used in Universal Dependencies. Equally important, SentiCode enables sentiment analysis in multi-language and multi-domain environments simultaneously. Several experiments were conducted using machine/deep learning techniques to evaluate the performance of SentiCode in multilingual (English, French, German, Arabic, and Russian) and multi-domain environments. In addition, the vocabulary proposed by SentiCode and the effect of each token were evaluated by the ablation method. The results highlight the 70% accuracy of SentiCode, with the best trade-off between efficiency and computing time (training and testing) in a total of about 0.67 seconds, which is very convenient for real-time applications.
format Online
Article
Text
id pubmed-9130974
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-9130974 2022-05-25 SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis Kanfoud, Mohamed Raouf Bouramoul, Abdelkrim J Intell Inf Syst Article
Springer US 2022-05-25 2022 /pmc/articles/PMC9130974/ /pubmed/35645462 http://dx.doi.org/10.1007/s10844-022-00714-8 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Kanfoud, Mohamed Raouf
Bouramoul, Abdelkrim
SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis
title SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis
title_sort senticode: a new paradigm for one-time training and global prediction in multilingual sentiment analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9130974/
https://www.ncbi.nlm.nih.gov/pubmed/35645462
http://dx.doi.org/10.1007/s10844-022-00714-8
work_keys_str_mv AT kanfoudmohamedraouf senticodeanewparadigmforonetimetrainingandglobalpredictioninmultilingualsentimentanalysis
AT bouramoulabdelkrim senticodeanewparadigmforonetimetrainingandglobalpredictioninmultilingualsentimentanalysis