
On the Reusability of Sentiment Analysis Datasets in Applications with Dissimilar Contexts

The main goal of this paper is to evaluate the usability of several algorithms on various sentiment-labeled datasets. The process of creating good semantic vector representations for textual data is considered a very demanding task for the research community. The first and most important step of a N...


Bibliographic Details
Main Authors:	Sarlis, S., Maglogiannis, I.
Format:	Online Article Text
Language:	English
Published:	2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7256387/ http://dx.doi.org/10.1007/978-3-030-49161-1_34

author	Sarlis, S. Maglogiannis, I.
collection	PubMed
description	The main goal of this paper is to evaluate the usability of several algorithms on various sentiment-labeled datasets. Creating good semantic vector representations for textual data is considered a very demanding task for the research community. The first and most important step of a Natural Language Processing (NLP) system is text preprocessing, which greatly affects the overall accuracy of the classification algorithms. In this work, two vector space models are created, and a variety of algorithms are studied on them. The work is based on the IMDb dataset, which contains movie reviews along with their associated labels (positive or negative). The goal is to obtain the model with the highest accuracy and the best generalization. To measure how well these models generalize to other domains, several datasets, which are analyzed in more detail later in the paper, are used.
format	Online Article Text
id	pubmed-7256387
institution	National Center for Biotechnology Information
language	English
publishDate	2020
record_format	MEDLINE/PubMed
spelling	pubmed-72563872020-05-29 On the Reusability of Sentiment Analysis Datasets in Applications with Dissimilar Contexts Sarlis, S. Maglogiannis, I. Artificial Intelligence Applications and Innovations Article The main goal of this paper is to evaluate the usability of several algorithms on various sentiment-labeled datasets. Creating good semantic vector representations for textual data is considered a very demanding task for the research community. The first and most important step of a Natural Language Processing (NLP) system is text preprocessing, which greatly affects the overall accuracy of the classification algorithms. In this work, two vector space models are created, and a variety of algorithms are studied on them. The work is based on the IMDb dataset, which contains movie reviews along with their associated labels (positive or negative). The goal is to obtain the model with the highest accuracy and the best generalization. To measure how well these models generalize to other domains, several datasets, which are analyzed in more detail later in the paper, are used. 2020-05-06 /pmc/articles/PMC7256387/ http://dx.doi.org/10.1007/978-3-030-49161-1_34 Text en © IFIP International Federation for Information Processing 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	On the Reusability of Sentiment Analysis Datasets in Applications with Dissimilar Contexts
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7256387/ http://dx.doi.org/10.1007/978-3-030-49161-1_34
