A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts

Bibliographic Details
Main Authors: Batanović, Vuk, Cvetanović, Miloš, Nikolić, Boško
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660500/
https://www.ncbi.nlm.nih.gov/pubmed/33180825
http://dx.doi.org/10.1371/journal.pone.0242050
author Batanović, Vuk
Cvetanović, Miloš
Nikolić, Boško
collection PubMed
description Choosing a comprehensive and cost-effective way of articulating and annotating the sentiment of a text is not a trivial task, particularly when dealing with short texts, in which sentiment can be expressed through a wide variety of linguistic and rhetorical phenomena. This problem is especially conspicuous in resource-limited settings and languages, where design options are restricted either in terms of manpower and financial means required to produce appropriate sentiment analysis resources, or in terms of available language tools, or both. In this paper, we present a versatile approach to addressing this issue, based on multiple interpretations of sentiment labels that encode information regarding the polarity, subjectivity, and ambiguity of a text, as well as the presence of sarcasm or a mixture of sentiments. We demonstrate its use on Serbian, a resource-limited language, via the creation of a main sentiment analysis dataset focused on movie comments, and two smaller datasets belonging to the movie and book domains. In addition to measuring the quality of the annotation process, we propose a novel metric to validate its cost-effectiveness. Finally, the practicality of our approach is further validated by training, evaluating, and determining the optimal configurations of several different kinds of machine-learning models on a range of sentiment classification tasks using the produced dataset.
format Online
Article
Text
id pubmed-7660500
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7660500 2020-11-18 A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts Batanović, Vuk Cvetanović, Miloš Nikolić, Boško PLoS One Research Article
Public Library of Science 2020-11-12 /pmc/articles/PMC7660500/ /pubmed/33180825 http://dx.doi.org/10.1371/journal.pone.0242050 Text en © 2020 Batanović et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660500/
https://www.ncbi.nlm.nih.gov/pubmed/33180825
http://dx.doi.org/10.1371/journal.pone.0242050