SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis
The main objective of multilingual sentiment analysis is to analyze reviews regardless of the original language in which they are written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is a challenge since each language is diffe...
Main authors: | Kanfoud, Mohamed Raouf; Bouramoul, Abdelkrim |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9130974/ https://www.ncbi.nlm.nih.gov/pubmed/35645462 http://dx.doi.org/10.1007/s10844-022-00714-8 |
_version_ | 1784713085679828992 |
---|---|
author | Kanfoud, Mohamed Raouf; Bouramoul, Abdelkrim |
author_facet | Kanfoud, Mohamed Raouf; Bouramoul, Abdelkrim |
author_sort | Kanfoud, Mohamed Raouf |
collection | PubMed |
description | The main objective of multilingual sentiment analysis is to analyze reviews regardless of the language in which they were originally written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is challenging because languages differ in syntax, grammar, and other respects. This paper presents SentiCode, a new language-independent representation approach for sentiment analysis. Unlike previous work in multilingual sentiment analysis, the proposed approach does not rely on machine translation to bridge the gap between languages. Instead, it exploits features common to languages, such as the part-of-speech tags used in Universal Dependencies. Equally important, SentiCode enables sentiment analysis in multi-language and multi-domain environments simultaneously. Several experiments were conducted using machine learning and deep learning techniques to evaluate the performance of SentiCode in multilingual (English, French, German, Arabic, and Russian) and multi-domain environments. In addition, the vocabulary proposed by SentiCode and the effect of each token were evaluated by ablation. The results show that SentiCode reaches 70% accuracy with the best trade-off between efficiency and computing time: training and testing together take about 0.67 seconds, which makes it well suited to real-time applications. |
format | Online Article Text |
id | pubmed-9130974 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-9130974 2022-05-25 SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis Kanfoud, Mohamed Raouf Bouramoul, Abdelkrim J Intell Inf Syst Article The main objective of multilingual sentiment analysis is to analyze reviews regardless of the original language in which they are written. Switching from one language to another is very common on social media platforms. Analyzing these multilingual reviews is a challenge since each language is different in terms of syntax, grammar, etc. This paper presents a new language-independent representation approach for sentiment analysis, SentiCode. Unlike previous work in multilingual sentiment analysis, the proposed approach does not rely on machine translation to bridge the gap between different languages. Instead, it exploits common features of languages, such as part-of-speech tags used in Universal Dependencies. Equally important, SentiCode enables sentiment analysis in multi-language and multi-domain environments simultaneously. Several experiments were conducted using machine/deep learning techniques to evaluate the performance of SentiCode in multilingual (English, French, German, Arabic, and Russian) and multi-domain environments. In addition, the vocabulary proposed by SentiCode and the effect of each token were evaluated by the ablation method. The results highlight the 70% accuracy of SentiCode, with the best trade-off between efficiency and computing time (training and testing) in a total of about 0.67 seconds, which is very convenient for real-time applications. 
Springer US 2022-05-25 2022 /pmc/articles/PMC9130974/ /pubmed/35645462 http://dx.doi.org/10.1007/s10844-022-00714-8 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Kanfoud, Mohamed Raouf Bouramoul, Abdelkrim SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis |
title | SentiCode: A new paradigm for one-time training and global prediction in multilingual sentiment analysis |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9130974/ https://www.ncbi.nlm.nih.gov/pubmed/35645462 http://dx.doi.org/10.1007/s10844-022-00714-8 |