An aspect-level sentiment analysis dataset for therapies on Twitter

The dataset described is an aspect-level sentiment analysis dataset for therapies, including medication, behavioral and other therapies, created by leveraging user-generated text from Twitter. The dataset was constructed by collecting Twitter posts using keywords associated with the therapies (often...

Bibliographic Details
Main Authors: Guo, Yuting; Das, Sudeshna; Lakamana, Sahithi; Sarker, Abeed
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects: Data Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10558704/
https://www.ncbi.nlm.nih.gov/pubmed/37808542
http://dx.doi.org/10.1016/j.dib.2023.109618
_version_ 1785117336904138752
author Guo, Yuting
Das, Sudeshna
Lakamana, Sahithi
Sarker, Abeed
collection PubMed
description This article describes an aspect-level sentiment analysis dataset for therapies, including medication, behavioral, and other therapies, built from user-generated text on Twitter. The dataset was constructed by collecting Twitter posts using keywords associated with the therapies (often referred to as treatments). Subsets of the collected posts were then manually reviewed, and annotation guidelines were developed to categorize the posts as positive, negative, or neutral. The dataset contains a total of 5364 posts mentioning 32 therapies. These posts were manually categorized as 998 (18.6%) positive, 619 (11.5%) negative, and 3747 (69.9%) neutral. Inter-annotator agreement was evaluated using Cohen's kappa, which yielded a score of 0.82. The dataset is intended to support the development of automatic systems that detect users' sentiments toward therapies based on their posts. While other sentiment analysis datasets are available, this is the first that encodes sentiments associated with specific therapies. Researchers and developers can use it to train sentiment analysis models, natural language processing algorithms, or machine learning systems that identify and analyze the sentiments expressed by consumers on social media platforms such as Twitter.
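The class percentages and the agreement statistic mentioned in the description can be illustrated with a short sketch. The counts below are the ones reported in the description; the function names and the toy annotator labels are hypothetical and are not taken from the dataset or from the authors' code. The kappa function is a plain implementation of the standard two-annotator formula, shown only to make the 0.82 figure easier to interpret.

```python
from collections import Counter

# Counts as reported in the description above.
REPORTED_COUNTS = {"positive": 998, "negative": 619, "neutral": 3747}


def class_shares(counts):
    """Return each class's share of the total, in percent, rounded to 1 decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies, summed over labels.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used the same single label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print(sum(REPORTED_COUNTS.values()))
    print(class_shares(REPORTED_COUNTS))
    # Hypothetical toy labels, for illustration only.
    toy_a = ["positive", "positive", "negative", "neutral"]
    toy_b = ["positive", "negative", "negative", "neutral"]
    print(cohens_kappa(toy_a, toy_b))
```

The three reported counts sum to the stated total of 5364 posts, and the rounded shares agree with the percentages given in the description.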
format Online
Article
Text
id pubmed-10558704
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10558704 2023-10-08 An aspect-level sentiment analysis dataset for therapies on Twitter Guo, Yuting Das, Sudeshna Lakamana, Sahithi Sarker, Abeed Data Brief Data Article Elsevier 2023-09-23 /pmc/articles/PMC10558704/ /pubmed/37808542 http://dx.doi.org/10.1016/j.dib.2023.109618 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title An aspect-level sentiment analysis dataset for therapies on Twitter
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10558704/
https://www.ncbi.nlm.nih.gov/pubmed/37808542
http://dx.doi.org/10.1016/j.dib.2023.109618